PLANT BIOCHEMISTRY-RELATED GENES

RELATED APPLICATION INFORMATION 

The present invention claims the benefit of US Provisional Patent Application Serial
Nos. 60/166,228 filed November 17, 1999 and 60/197,899 filed April 17, 2000 and "Plant Trait
Modification III" filed August 22, 2000.

FIELD OF THE INVENTION 

This invention relates to the field of plant biology. More particularly, the present
invention pertains to compositions and methods for phenotypically modifying a plant.

BACKGROUND OF THE INVENTION

Transcription factors can modulate gene expression, either increasing or decreasing
(inducing or repressing) the rate of transcription. This modulation results in differential levels of
gene expression at various developmental stages, in different tissues and cell types, and in
response to different exogenous (e.g., environmental) and endogenous stimuli throughout the life
cycle of the organism.

Because transcription factors are key controlling elements of biological pathways,
altering the expression levels of one or more transcription factors can change entire biological
pathways in an organism. For example, manipulation of the levels of selected transcription
factors may result in increased expression of economically useful proteins or metabolic chemicals
in plants, or may improve other agriculturally relevant characteristics. Conversely, blocked or
reduced expression of a transcription factor may reduce biosynthesis of unwanted compounds or
remove an undesirable trait. Therefore, manipulating transcription factor levels in a plant offers
tremendous potential in agricultural biotechnology for modifying a plant's traits.

The present invention provides novel transcription factors useful for modifying a plant's
phenotype in desirable ways, such as modifying a plant's biochemical traits.

SUMMARY OF THE INVENTION 

In a first aspect, the invention relates to a recombinant polynucleotide comprising a
nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a
polypeptide comprising a sequence selected from SEQ ID Nos. 2N, where N=1-22, or a
complementary nucleotide sequence thereof; (b) a nucleotide sequence encoding a polypeptide
comprising a conservatively substituted variant of a polypeptide of (a); (c) a nucleotide sequence
comprising a sequence selected from those of SEQ ID Nos. 2N-1, where N=1-22, or a
complementary nucleotide sequence thereof; (d) a nucleotide sequence comprising silent
substitutions in a nucleotide sequence of (c); (e) a nucleotide sequence which hybridizes under
stringent conditions over substantially the entire length of a nucleotide sequence of one or more
of: (a), (b), (c), or (d); (f) a nucleotide sequence comprising at least 15 consecutive nucleotides of
a sequence of any of (a)-(e); (g) a nucleotide sequence comprising a subsequence or fragment of
any of (a)-(f), which subsequence or fragment encodes a polypeptide having a biological activity
that modifies a plant's biochemical characteristic; (h) a nucleotide sequence having at least 31%
sequence identity to a nucleotide sequence of any of (a)-(g); (i) a nucleotide sequence having at
least 60% sequence identity to a nucleotide sequence of any of (a)-(g); (j) a nucleotide
sequence which encodes a polypeptide having at least 31% sequence identity to a
polypeptide of SEQ ID Nos. 2N, where N=1-22; (k) a nucleotide sequence which encodes a
polypeptide having at least 60% sequence identity to a polypeptide of SEQ ID Nos. 2N,
where N=1-22; and (l) a nucleotide sequence which encodes a conserved domain of a polypeptide
having at least 65% sequence identity to a conserved domain of a polypeptide of SEQ ID Nos.
2N, where N=1-22. The recombinant polynucleotide may further comprise a constitutive,
inducible, or tissue-active promoter operably linked to the nucleotide sequence. The invention
also relates to compositions comprising at least two of the above described polynucleotides.
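
The numbering scheme used above (SEQ ID Nos. 2N-1 for nucleotide sequences and 2N for polypeptide sequences, with N running from 1 to 22) can be sketched as follows. This is only an illustration of the arithmetic; the function name is hypothetical, and the pairing of each odd-numbered nucleotide sequence with the next even-numbered polypeptide is an assumption based on the usual Sequence Listing convention rather than something stated explicitly in this passage.

```python
# Illustrative sketch only: enumerate the SEQ ID Nos. implied by "2N-1" and "2N".
# Assumption: SEQ ID No. 2N-1 (polynucleotide) pairs with SEQ ID No. 2N (polypeptide).
def seq_id_pairs(n_max: int = 22) -> list[tuple[int, int]]:
    """Return (polynucleotide SEQ ID No., polypeptide SEQ ID No.) for N = 1..n_max."""
    return [(2 * n - 1, 2 * n) for n in range(1, n_max + 1)]


pairs = seq_id_pairs()
print(pairs[0])   # (1, 2)
print(pairs[-1])  # (43, 44)
```

Under this reading, the 22 values of N cover SEQ ID Nos. 1 through 44.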

In a second aspect, the invention is an isolated or recombinant polypeptide comprising a 
subsequence of at least about 15 contiguous amino acids encoded by the recombinant or isolated 
polynucleotide described above. 

In another aspect, the invention is a transgenic plant comprising one or more of the above
described recombinant polynucleotides. In yet another aspect, the invention is a plant with
altered expression levels of a polynucleotide described above or a plant with altered expression or
activity levels of an above described polypeptide. Further, the invention is a plant lacking a
nucleotide sequence encoding a polypeptide described above. The plant may be a soybean,
wheat, corn, potato, cotton, rice, oilseed rape, sunflower, alfalfa, sugarcane, turf, banana,
blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber,
eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple,
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, or vegetable brassicas
plant.

In a further aspect, the invention relates to a cloning or expression vector comprising the
isolated or recombinant polynucleotide described above or cells comprising the cloning or
expression vector.



WO 01/36597 



PCT/US00/31344 



In yet a further aspect, the invention relates to a composition produced by incubating a
polynucleotide of the invention with a nuclease, a restriction enzyme, a polymerase, a
polymerase and a primer, a cloning vector, or with a cell.

Furthermore, the invention relates to a method for producing a plant having a modified
biochemical trait. The method comprises altering the expression of an isolated or recombinant
polynucleotide of the invention or altering the expression or activity of a polypeptide of the
invention in a plant to produce a modified plant, and selecting the modified plant for a modified
biochemical trait.

In another aspect, the invention relates to a method of identifying a factor that is
modulated by or interacts with a polypeptide encoded by a polynucleotide of the invention. The
method comprises expressing a polypeptide encoded by the polynucleotide in a plant; and
identifying at least one factor that is modulated by or interacts with the polypeptide. In one
embodiment the method for identifying modulating or interacting factors is by detecting binding
by the polypeptide to a promoter sequence, or by detecting interactions between an additional
protein and the polypeptide in a yeast two-hybrid system, or by detecting expression of a factor by
hybridization to a microarray, subtractive hybridization or differential display.

In yet another aspect, the invention is a method of identifying a molecule that modulates
activity or expression of a polynucleotide or polypeptide of interest. The method comprises
placing the molecule in contact with a plant comprising the polynucleotide or polypeptide
encoded by the polynucleotide of the invention and monitoring one or more of the expression
level of the polynucleotide in the plant, the expression level of the polypeptide in the plant, and
modulation of an activity of the polypeptide in the plant.

In yet another aspect, the invention relates to an integrated system, computer or computer
readable medium comprising one or more character strings corresponding to a polynucleotide of
the invention, or to a polypeptide encoded by the polynucleotide. The integrated system,
computer or computer readable medium may comprise a link between one or more sequence
strings and a modified plant biochemical trait.

In yet another aspect, the invention is a method for identifying a sequence similar or
homologous to one or more polynucleotides of the invention, or one or more polypeptides
encoded by the polynucleotides. The method comprises providing a sequence database; and,
querying the sequence database with one or more target sequences corresponding to the one or
more polynucleotides or to the one or more polypeptides to identify one or more sequence
members of the database that display sequence similarity or homology to one or more of the one
or more target sequences.




The method may further comprise linking one or more of the polynucleotides of
the invention, or encoded polypeptides, to a modified plant biochemical phenotype.
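
The database query described above can be pictured with a deliberately simplified sketch. It assumes an in-memory dictionary standing in for the sequence database and uses the fraction of shared k-mers as a crude similarity score; both are illustrative assumptions, as are all names and example sequences. The text does not specify a search algorithm, and a practical search would normally rely on an established alignment program.

```python
# Toy stand-in for querying a sequence database with a target sequence.
# Assumption: similarity is scored as the fraction of the target's k-mers
# that also occur in a database entry; this is not the method of the text.
def kmers(seq: str, k: int = 3) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def query_database(target: str, database: dict[str, str],
                   k: int = 3, min_score: float = 0.5) -> list[tuple[str, float]]:
    """Return (entry name, score) pairs with score >= min_score, best first."""
    target_kmers = kmers(target, k)
    hits = []
    for name, seq in database.items():
        shared = target_kmers & kmers(seq, k)
        score = len(shared) / len(target_kmers) if target_kmers else 0.0
        if score >= min_score:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical entries, for illustration only.
database = {
    "entry_1": "ATGGCTTCAGGA",
    "entry_2": "CCCCCCCCCCCC",
    "entry_3": "ATGGCTTCTGGA",
}
hits = query_database("ATGGCTTCAGGA", database)
print([name for name, _ in hits])  # ['entry_1', 'entry_3']
```

Entries returned by such a query could then be linked to a recorded phenotype, for example through a second dictionary keyed by entry name.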

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 provides a table of exemplary polynucleotide and polypeptide sequences of the
invention. The table includes from left to right for each sequence: the SEQ ID No., the internal
code reference number (GID), whether the sequence is a polynucleotide or polypeptide sequence,
and identification of any conserved domains for the polypeptide sequences.

Figure 2 provides a table of exemplary sequences that are homologous to other sequences
provided in the Sequence Listing and that are derived from Arabidopsis thaliana. The table
includes from left to right: the SEQ ID No., the internal code reference number (GID),
identification of the homologous sequence, whether the sequence is a polynucleotide or
polypeptide sequence, and identification of any conserved domains for the polypeptide
sequences.

Figure 3 provides a table of exemplary sequences that are homologous to the sequences
provided in Figures 1 and 2 and that are derived from plants other than Arabidopsis thaliana. The
table includes from left to right: the SEQ ID No., the internal code reference number (GID), the
unique GenBank sequence ID No. (NID), the probability that the comparison was generated by
chance (P-value), and the species from which the homologous gene was identified.

DETAILED DESCRIPTION

The present invention relates to polynucleotides and polypeptides, e.g., for modifying
phenotypes of plants.

In particular, the polynucleotides or polypeptides are useful for modifying traits
associated with a plant's biochemical characteristic when the expression levels of the
polynucleotides or expression levels or activity levels of the polypeptides are altered.

The polynucleotides of the invention encode plant transcription factors. The plant
transcription factors are derived, e.g., from Arabidopsis thaliana and can belong, e.g., to one or
more of the following transcription factor families: the AP2 (APETALA2) domain transcription
factor family (Riechmann and Meyerowitz (1998) J. Biol. Chem. 379:633-646); the MYB
transcription factor family (Martin and Paz-Ares (1997) Trends Genet. 13:67-73); the MADS
domain transcription factor family (Riechmann and Meyerowitz (1997) J. Biol. Chem. 378:1079-1101);
the WRKY protein family (Ishiguro and Nakamura (1994) Mol. Gen. Genet. 244:563-571);
the ankyrin-repeat protein family (Zhang et al. (1992) Plant Cell 4:1575-1588); the
miscellaneous protein (MISC) family (Kim et al. (1997) Plant J. 11:1237-1251); the zinc finger
protein (Z) family (Klug and Schwabe (1995) FASEB J. 9:597-604); the homeobox (HB) protein
family (Duboule (1994) Guidebook to the Homeobox Genes, Oxford University Press); the
CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev. 3:1166-1178); the
squamosa promoter binding proteins (SPB) (Klein et al. (1996) Mol. Gen. Genet. 250:7-16);
the NAM protein family; the IAA/AUX proteins (Rouse et al. (1998) Science 279:1371-1373);
the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1:639-709); the DNA-binding
protein (DBP) family (Tucker et al. (1994) EMBO J. 13:2994-3002); the bZIP family of
transcription factors (Foster et al. (1994) FASEB J. 8:192-200); the BPF-1 protein (Box P-binding
factor) family (da Costa e Silva et al. (1993) Plant J. 4:125-135); and the golden protein
(GLD) family (Hall et al. (1998) Plant Cell 10:925-936).

In addition to methods for modifying a plant phenotype by employing one or more
polynucleotides and polypeptides of the invention described herein, the polynucleotides and
polypeptides of the invention have a variety of additional uses. These uses include their use in
the recombinant production (i.e., expression) of proteins; as regulators of plant gene expression; as
diagnostic probes for the presence of complementary or partially complementary nucleic acids
(including for detection of natural coding nucleic acids); as substrates for further reactions, e.g.,
mutation reactions, PCR reactions, or the like, or as substrates for cloning, e.g., including
digestion or ligation reactions; and for identifying exogenous or endogenous modulators of the
transcription factors.

DEFINITIONS 

A "polynucleotide" is a nucleic acid sequence comprising a plurality of polymerized
nucleotide residues, e.g., at least about 15 consecutive polymerized nucleotide residues,
optionally at least about 30 consecutive nucleotides, or at least about 50 consecutive nucleotides. In
many instances, a polynucleotide comprises a nucleotide sequence encoding a polypeptide (or
protein) or a domain or fragment thereof. Additionally, the polynucleotide may comprise a
promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation site, 5' or
3' untranslated regions, a reporter gene, a selectable marker, or the like. The polynucleotide can
be single stranded or double stranded DNA or RNA. The polynucleotide optionally comprises
modified bases or a modified backbone. The polynucleotide can be, e.g., genomic DNA or RNA,
a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA, a synthetic DNA or
RNA, or the like. The polynucleotide can comprise a sequence in either sense or antisense
orientations.

A "recombinant polynucleotide" is a polynucleotide that is not in its native state, e.g., the
polynucleotide comprises a nucleotide sequence not found in nature, or the polynucleotide is in a
context other than that in which it is naturally found, e.g., separated from nucleotide sequences
with which it typically is in proximity in nature, or adjacent to (or contiguous with) nucleotide
sequences with which it typically is not in proximity. For example, the sequence at issue can be
cloned into a vector, or otherwise recombined with one or more additional nucleic acids.

An "isolated polynucleotide" is a polynucleotide, whether naturally occurring or
recombinant, that is present outside the cell in which it is typically found in nature, whether
purified or not. Optionally, an isolated polynucleotide is subject to one or more enrichment or
purification procedures, e.g., cell lysis, extraction, centrifugation, precipitation, or the like.

A "recombinant polypeptide" is a polypeptide produced by translation of a recombinant
polynucleotide. An "isolated polypeptide," whether a naturally occurring or a recombinant
polypeptide, is more enriched in (or out of) a cell than the polypeptide in its natural state in a wild
type cell, e.g., more than about 5% enriched, more than about 10% enriched, or more than about
20%, or more than about 50%, or more, enriched, i.e., alternatively denoted: 105%, 110%, 120%,
150% or more, enriched relative to wild type standardized at 100%. Such an enrichment is not
the result of a natural response of a wild type plant. Alternatively, or additionally, the isolated
polypeptide is separated from other cellular components with which it is typically associated, e.g.,
by any of the various protein purification methods herein.

The term "transgenic plant" refers to a plant that contains genetic material not found in a
wild type plant of the same species, variety or cultivar. The genetic material may include a
transgene, an insertional mutagenesis event (such as by transposon or T-DNA insertional
mutagenesis), an activation tagging sequence, a mutated sequence, a homologous recombination
event or a sequence modified by chimeraplasty. Typically, the foreign genetic material has been
introduced into the plant by human manipulation.

A transgenic plant may contain an expression vector or cassette. The expression cassette
typically comprises a polypeptide-encoding sequence operably linked to (i.e., under regulatory
control of) appropriate inducible or constitutive regulatory sequences that allow for the
expression of the polypeptide. The expression cassette can be introduced into a plant by
transformation or by breeding after transformation of a parent plant. A plant refers to a whole
plant as well as to a plant part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any
other plant material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems
that mimic biochemical or cellular components or processes in a cell.

The phrase "ectopic expression or altered expression" in reference to a polynucleotide
indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is different from
the expression pattern in a wild type plant or a reference plant of the same species. For example,
the polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or tissue
type in which the sequence is expressed in the wild type plant, or by expression at a time other
than at the time the sequence is expressed in the wild type plant, or by a response to different
inducible agents, such as hormones or environmental signals, or at different expression levels
(either higher or lower) compared with those found in a wild type plant. The term also refers to
altered expression patterns that are produced by lowering the levels of expression to below the
detection level or completely abolishing expression. The resulting expression pattern can be
transient or stable, constitutive or inducible. In reference to a polypeptide, the term "ectopic
expression or altered expression" further may relate to altered activity levels resulting from the
interactions of the polypeptides with exogenous or endogenous modulators or from interactions
with factors or as a result of the chemical modification of the polypeptides.

The term "fragment" or "domain," with respect to a polypeptide, refers to a subsequence
of the polypeptide. In some cases, the fragment or domain is a subsequence of the polypeptide
which performs at least one biological function of the intact polypeptide in substantially the same
manner, or to a similar extent, as does the intact polypeptide. For example, a polypeptide
fragment can comprise a recognizable structural motif or functional domain such as a DNA
binding domain that binds to a DNA promoter region, an activation domain or a domain for
protein-protein interactions. Fragments can vary in size from as few as 6 amino acids to the full
length of the intact polypeptide, but are preferably at least about 30 amino acids in length and
more preferably at least about 60 amino acids in length. In reference to a nucleotide sequence, "a
fragment" refers to any subsequence of a polynucleotide, typically of at least about 15
consecutive nucleotides, preferably at least about 30 nucleotides, more preferably at least about 50, of any
of the sequences provided herein.

The term "trait" refers to a physiological, morphological, biochemical or physical
characteristic of a plant or particular plant material or cell. In some instances, this characteristic
is visible to the human eye, such as seed or plant size, or can be measured by available
biochemical techniques, such as the protein, starch or oil content of seed or leaves or by the
observation of the expression level of genes, e.g., by employing Northern analysis, RT-PCR,
microarray gene expression assays or reporter gene expression systems, or by agricultural
observations such as stress tolerance, yield or pathogen tolerance.

"Trait modification" refers to a detectable difference in a characteristic in a plant
ectopically expressing a polynucleotide or polypeptide of the present invention relative to a plant
not doing so, such as a wild type plant. In some cases, the trait modification can be evaluated
quantitatively. For example, the trait modification can entail at least about a 2% increase or
decrease in an observed trait (difference), at least a 5% difference, at least about a 10%
difference, at least about a 20% difference, at least about a 30%, at least about a 50%, at least
about a 70%, or at least about a 100%, or an even greater difference. It is known that there can be
a natural variation in the modified trait. Therefore, the trait modification observed entails a
change of the normal distribution of the trait in the plants compared with the distribution
observed in wild type plants.
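
The quantitative evaluation described above can be sketched as a percent difference between the mean trait value of modified plants and that of wild type plants. The function name and the measurements below are hypothetical, and comparing means is only one simple way to summarize a shift in the trait distribution.

```python
from statistics import mean


# Minimal sketch, assuming the trait is summarized by its mean in each group.
def percent_difference(modified: list[float], wild_type: list[float]) -> float:
    """Percent increase (positive) or decrease (negative) relative to wild type."""
    wt_mean = mean(wild_type)
    return 100.0 * (mean(modified) - wt_mean) / wt_mean


# Hypothetical seed oil content measurements (% of dry weight).
wild_type_oil = [20.0, 21.0, 19.0]
modified_oil = [24.0, 25.0, 23.0]
print(percent_difference(modified_oil, wild_type_oil))  # 20.0
```

Because natural variation exists within each group, a comparison of this kind is more informative when several plants per group are measured.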

Trait modifications of particular interest include those to seed (such as embryo or
endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: enhanced
tolerance to environmental conditions including freezing, chilling, heat, drought, water saturation,
radiation and ozone; improved tolerance to microbial, fungal or viral diseases; improved
tolerance to pest infestations, including nematodes, mollicutes, parasitic higher plants or the like;
decreased herbicide sensitivity; improved tolerance of heavy metals or enhanced ability to take up
heavy metals; improved growth under poor photoconditions (e.g., low light and/or short day
length), or changes in expression levels of genes of interest. Other phenotypes that can be
modified relate to the production of plant metabolites, such as variations in the production of
taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax monomers, antioxidants,
amino acids, lignins, cellulose, tannins, prenyllipids (such as chlorophylls and carotenoids),
glucosinolates, and terpenoids, enhanced or compositionally altered protein or oil production
(especially in seeds), or modified sugar (insoluble or soluble) and/or starch composition.
Physical plant characteristics that can be modified include cell development (such as the number
of trichomes), fruit and seed size and number, yields of plant parts such as stems, leaves and
roots, the stability of the seeds during storage, characteristics of the seed pod (e.g., susceptibility
to shattering), root hair length and quantity, internode distances, or the quality of seed coat. Plant
growth characteristics that can be modified include growth rate, germination rate of seeds, vigor
of plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering time,
flower abscission, rate of nitrogen uptake, biomass or transpiration characteristics, as well as
plant architecture characteristics such as apical dominance, branching patterns, number of organs,
organ identity, organ shape or size.

POLYPEPTIDES AND POLYNUCLEOTIDES OF THE INVENTION

The present invention provides, among other things, transcription factors (TFs), and
transcription factor homologue polypeptides, and isolated or recombinant polynucleotides
encoding the polypeptides. These polypeptides and polynucleotides may be employed to modify
a plant's biochemical characteristic.

Exemplary polynucleotides encoding the polypeptides of the invention were identified in
the Arabidopsis thaliana GenBank database using publicly available sequence analysis programs
and parameters. Sequences initially identified were then further characterized to identify
sequences comprising specified sequence strings corresponding to sequence motifs present in
families of known transcription factors. Polynucleotide sequences meeting such criteria were
confirmed as transcription factors.

Additional polynucleotides of the invention were identified by screening Arabidopsis
thaliana and/or other plant cDNA libraries with probes corresponding to known transcription
factors under low stringency hybridization conditions. Additional sequences, including full
length coding sequences, were subsequently recovered by the rapid amplification of cDNA ends
(RACE) procedure, using a commercially available kit according to the manufacturer's
instructions. Where necessary, multiple rounds of RACE were performed to isolate 5' and 3' ends.
The full length cDNA was then recovered by a routine end-to-end polymerase chain reaction
(PCR) using primers specific to the isolated 5' and 3' ends. Exemplary sequences are provided in
the Sequence Listing.

The polynucleotides of the invention were ectopically expressed in overexpressor or
knockout plants and changes in the biochemical characteristics of the plants were observed.
Therefore, the polynucleotides and polypeptides can be employed to improve the biochemical
characteristics of plants.

Making polynucleotides

The polynucleotides of the invention include sequences that encode transcription factors
and transcription factor homologue polypeptides and sequences complementary thereto, as well
as unique fragments of coding sequence, or sequence complementary thereto. Such
polynucleotides can be, e.g., DNA or RNA, e.g., mRNA, cRNA, synthetic RNA, genomic DNA,
cDNA, synthetic DNA, oligonucleotides, etc. The polynucleotides are either double-stranded or
single-stranded, and include either or both of sense (i.e., coding) sequences and antisense (i.e.,
non-coding, complementary) sequences. The polynucleotides include the coding sequence of a
transcription factor, or transcription factor homologue polypeptide, in isolation, in combination
with additional coding sequences (e.g., a purification tag, a localization signal, as a fusion
protein, as a pre-protein, or the like), in combination with non-coding sequences (e.g., introns or
inteins, regulatory elements such as promoters, enhancers, terminators, and the like), and/or in a
vector or host environment in which the polynucleotide encoding a transcription factor or
transcription factor homologue polypeptide is an endogenous or exogenous gene.
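
The relationship between a sense (coding) sequence and its antisense (complementary) sequence mentioned above can be illustrated with a short sketch. The function name and the example sequence are hypothetical, and only unambiguous DNA bases are handled.

```python
# Minimal sketch: derive the antisense strand (reverse complement) of a DNA sequence.
# Assumption: the input contains only A, C, G and T (upper or lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Complement each base, then reverse so the result reads 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


sense = "ATGGCTTCA"  # hypothetical coding fragment
antisense = reverse_complement(sense)
print(antisense)  # TGAAGCCAT
```

Applying the function twice returns the original sense sequence.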

A variety of methods exist for producing the polynucleotides of the invention. Procedures
for identifying and isolating DNA clones are well known to those of skill in the art, and are
described in, e.g., Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in
Enzymology volume 152, Academic Press, Inc., San Diego, CA ("Berger"); Sambrook et al.,
Molecular Cloning - A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory,
Cold Spring Harbor, New York, 1989 ("Sambrook") and Current Protocols in Molecular Biology,
F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing
Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2000) ("Ausubel").

Alternatively, polynucleotides of the invention can be produced by a variety of in vitro
amplification methods adapted to the present invention by appropriate selection of specific or
degenerate primers. Examples of protocols sufficient to direct persons of skill through in vitro
amplification methods, including the polymerase chain reaction (PCR), the ligase chain reaction
(LCR), Qbeta-replicase amplification and other RNA polymerase mediated techniques (e.g.,
NASBA), e.g., for the production of the homologous nucleic acids of the invention, are found in
Berger, Sambrook, and Ausubel, as well as Mullis et al., (1987) PCR Protocols A Guide to
Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, CA (1990) ("Innis").
Improved methods for cloning in vitro amplified nucleic acids are described in Wallace et al.,
U.S. Pat. No. 5,426,039. Improved methods for amplifying large nucleic acids by PCR are
summarized in Cheng et al. (1994) Nature 369:684-685 and the references cited therein, in which
PCR amplicons of up to 40 kb are generated. One of skill will appreciate that essentially any
RNA can be converted into a double stranded DNA suitable for restriction digestion, PCR
expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., Ausubel,
Sambrook and Berger, all supra.

Alternatively, polynucleotides and oligonucleotides of the invention can be assembled
from fragments produced by solid-phase synthesis methods. Typically, fragments of up to
approximately 100 bases are individually synthesized and then enzymatically or chemically
ligated to produce a desired sequence, e.g., a polynucleotide encoding all or part of a
transcription factor. For example, chemical synthesis using the phosphoramidite method is
described, e.g., by Beaucage et al. (1981) Tetrahedron Letters 22:1859-69; and Matthes et al.
(1984) EMBO J. 3:801-5. According to such methods, oligonucleotides are synthesized, purified,
annealed to their complementary strand, ligated and then optionally cloned into suitable vectors.
If so desired, the polynucleotides and polypeptides of the invention can be custom ordered
from any of a number of commercial suppliers.

HOMOLOGOUS SEQUENCES

Sequences homologous, i.e., that share significant sequence identity or similarity, to those
provided in the Sequence Listing, derived from Arabidopsis thaliana or from other plants of
choice are also an aspect of the invention. Homologous sequences can be derived from any plant
including monocots and dicots and in particular agriculturally important plant species, including
but not limited to, crops such as soybean, wheat, corn, potato, cotton, rice, oilseed rape (including
canola), sunflower, alfalfa, sugarcane and turf; or fruits and vegetables, such as banana,
blackberry, blueberry, strawberry, and raspberry, cantaloupe, carrot, cauliflower, coffee,
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers,
pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits (such as
apple, peach, pear, cherry and plum) and vegetable brassicas (such as broccoli, cabbage,
cauliflower, Brussels sprouts and kohlrabi). Other crops, fruits and vegetables whose phenotype
can be changed include barley, rye, millet, sorghum, currant, avocado, citrus fruits such as
oranges, lemons, grapefruit and tangerines, artichoke, cherries, nuts such as the walnut and
peanut, endive, leek, roots, such as arrowroot, beet, cassava, turnip, radish, yam, and sweet
potato, and beans. The homologous sequences may also be derived from woody species, such as
pine, poplar and eucalyptus.

Transcription factors that are homologous to the listed sequences will typically share at 
least about 30% amino acid sequence identity. More closely related transcription factors can 
share at least about 50%, about 60%, about 65%, about 70%, about 75% or about 80% or about 
90% or about 95% or about 98% or more sequence identity with the listed sequences. Factors 

25 that are most closely related to the listed sequences share, e.g., at least about 85%, about 90% or 
about 95% or more % sequence identity to the listed sequences. At the nucleotide level, the 
sequences will typically share at least about 40% nucleotide sequence identity, preferably at least 
about 50%, about 60%, about 70% or about 80% sequence identity, and more preferably about 
85%, about 90%, about 95% or about 97% or more sequence identity to one or more of the listed
sequences. The degeneracy of the genetic code enables major variations in the nucleotide
sequence of a polynucleotide while maintaining the amino acid sequence of the encoded protein.
Conserved domains within a transcription factor family may exhibit a higher degree of sequence
homology, such as at least 65% sequence identity including conservative substitutions, and
preferably at least 80% sequence identity. 
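
The identity thresholds above can be illustrated with a short calculation. The following Python sketch is illustrative only. It assumes two sequences that have already been aligned to equal length, with "-" marking gaps, and it counts identity over the columns in which neither sequence has a gap, which is only one of several conventions in use. The function names and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, minimum_percent: float = 30.0) -> bool:
    """True if the pair shares at least `minimum_percent` identity."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent
```

With the hypothetical aligned pair "MKV-LS" and "MKVALT", four of the five ungapped columns match, so the pair would pass a 30% threshold but not an 85% threshold.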

Identifying Nucleic Acids by Hybridization

Polynucleotides homologous to the sequences illustrated in the Sequence Listing can be
identified, e.g., by hybridization to each other under stringent or under highly stringent
conditions. Single stranded polynucleotides hybridize when they associate based on a variety of
well characterized physico-chemical forces, such as hydrogen bonding, solvent exclusion, base
stacking and the like. The stringency of a hybridization reflects the degree of sequence identity
of the nucleic acids involved, such that the higher the stringency, the more similar are the two
polynucleotide strands. Stringency is influenced by a variety of factors, including temperature,
salt concentration and composition, organic and non-organic additives, solvents, etc. present in
both the hybridization and wash solutions and incubations (and the number thereof), as described in more
detail in the references cited above.

An example of stringent hybridization conditions for hybridization of complementary
nucleic acids which have more than 100 complementary residues on a filter in a Southern or
northern blot is about 5°C to 20°C lower than the thermal melting point (Tm) for the specific
sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic
strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe.
Nucleic acid molecules that hybridize under stringent conditions will typically hybridize to a
probe based on either the entire cDNA or selected portions, e.g., to a unique subsequence, of the
cDNA under wash conditions of 0.2x SSC to 2.0x SSC, 0.1% SDS at 50-65°C, for example 0.2x
SSC, 0.1% SDS at 65°C. For identification of less closely related homologues, washes can be
performed at a lower temperature, e.g., 50°C. In general, stringency is increased by raising the
wash temperature and/or decreasing the concentration of SSC.
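
The relationships stated above can be written out as a small calculation. The Python sketch below is illustrative only. It assumes the Tm has already been determined for the sequence of interest (no Tm formula is given here, and none is supplied by the sketch), and the names WashCondition, stringent_temperature_window and is_more_stringent are hypothetical. It encodes only two statements from the text: the stringent window lies about 5°C to 20°C below the Tm, and stringency rises with higher wash temperature and/or lower SSC concentration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WashCondition:
    ssc_x: float   # SSC concentration as a multiple of 1x SSC
    temp_c: float  # wash temperature in degrees Celsius


def stringent_temperature_window(tm_c: float) -> Tuple[float, float]:
    """Return (low, high) temperatures lying 20 to 5 degrees C below the given Tm."""
    return (tm_c - 20.0, tm_c - 5.0)


def is_more_stringent(a: WashCondition, b: WashCondition) -> bool:
    """True if `a` is at least as hot and as dilute as `b`, and strictly so in one respect."""
    no_worse = a.temp_c >= b.temp_c and a.ssc_x <= b.ssc_x
    strictly_better = a.temp_c > b.temp_c or a.ssc_x < b.ssc_x
    return no_worse and strictly_better
```

Under these assumptions, the example wash of 0.2x SSC at 65°C compares as more stringent than a wash of 2.0x SSC at 50°C. Two conditions that differ in opposite directions (hotter but saltier) are deliberately left unordered by this sketch.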

As another example, stringent conditions can be selected such that an oligonucleotide that
is perfectly complementary to the coding oligonucleotide hybridizes to the coding oligonucleotide
with at least about a 5-10x higher signal to noise ratio than the ratio for hybridization of the
perfectly complementary oligonucleotide to a nucleic acid encoding a transcription factor known
as of the filing date of the application. Conditions can be selected such that a higher signal to
noise ratio is observed in the particular assay which is used, e.g., about 15x, 25x, 35x, 50x or
more. Accordingly, the subject nucleic acid hybridizes to the unique coding oligonucleotide with
at least a 2x higher signal to noise ratio as compared to hybridization of the coding
oligonucleotide to a nucleic acid encoding a known polypeptide. Again, higher signal to noise
ratios can be selected, e.g., about 5x, 10x, 25x, 35x, 50x or more. The particular signal will
depend on the label used in the relevant assay, e.g., a fluorescent label, a colorimetric label, a
radioactive label, or the like.

Alternatively, transcription factor homologue polypeptides can be obtained by screening 
an expression library using antibodies specific for one or more transcription factors. With the 
provision herein of the disclosed transcription factor, and transcription factor homologue nucleic
acid sequences, the encoded polypeptide(s) can be expressed and purified in a heterologous 
expression system (e.g., E. coli) and used to raise antibodies (monoclonal or polyclonal) specific 
for the polypeptide(s) in question. Antibodies can also be raised against synthetic peptides 
derived from transcription factor, or transcription factor homologue, amino acid sequences. 
Methods of raising antibodies are well known in the art and are described in Harlow and Lane
(1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New York. Such
antibodies can then be used to screen an expression library produced from the plant from which it 
is desired to clone additional transcription factor homologues, using the methods described above. 
The selected cDNAs can be confirmed by sequencing and enzymatic activity. 

SEQUENCE VARIATIONS

It will readily be appreciated by those of skill in the art that any of a variety of
polynucleotide sequences are capable of encoding the transcription factors and transcription
factor homologue polypeptides of the invention. Due to the degeneracy of the genetic code,
many different polynucleotides can encode identical and/or substantially similar polypeptides in
addition to those sequences illustrated in the Sequence Listing.

For example, Table 1 illustrates that the codons AGC, AGT, TCA, TCC, TCG, and
TCT all encode the same amino acid: serine. Accordingly, at each position in the sequence where 
there is a codon encoding serine, any of the above trinucleotide sequences can be used without 
altering the encoded polypeptide. 

Table 1

Amino acids                     Codon
Alanine         Ala     A       GCA  GCC  GCG  GCT
Cysteine        Cys     C       TGC  TGT
Aspartic acid   Asp     D       GAC  GAT
Glutamic acid   Glu     E       GAA  GAG
Phenylalanine   Phe     F       TTC  TTT
Glycine         Gly     G       GGA  GGC  GGG  GGT
Histidine       His     H       CAC  CAT
Isoleucine      Ile     I       ATA  ATC  ATT
Lysine          Lys     K       AAA  AAG
Leucine         Leu     L       TTA  TTG  CTA  CTC  CTG  CTT
Methionine      Met     M       ATG
Asparagine      Asn     N       AAC  AAT
Proline         Pro     P       CCA  CCC  CCG  CCT
Glutamine       Gln     Q       CAA  CAG
Arginine        Arg     R       AGA  AGG  CGA  CGC  CGG  CGT
Serine          Ser     S       AGC  AGT  TCA  TCC  TCG  TCT
Threonine       Thr     T       ACA  ACC  ACG  ACT
Valine          Val     V       GTA  GTC  GTG  GTT
Tryptophan      Trp     W       TGG
Tyrosine        Tyr     Y       TAC  TAT

Sequence alterations that do not change the amino acid sequence encoded by the 
polynucleotide are termed "silent" variations. With the exception of the codons ATG and TGG,
encoding methionine and tryptophan, respectively, any of the possible codons for the same amino 
acid can be substituted by a variety of techniques, e.g., site-directed mutagenesis, available in the 
art. Accordingly, any and all such variations of a sequence selected from the above table are a 
feature of the invention. 
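
The notion of a silent variation can be checked mechanically. The following Python sketch is illustrative only. It builds the standard genetic code in the DNA alphabet, lists the codons synonymous with a given codon, and tests whether two coding sequences of equal length encode the same polypeptide. The function names and the short example sequences are hypothetical and are not drawn from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def synonymous_codons(codon: str) -> list:
    """All codons (sorted) that encode the same amino acid as `codon`."""
    target = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == target)


def is_silent(original_cds: str, variant_cds: str) -> bool:
    """True if both sequences have equal length and encode the same polypeptide."""
    return (len(original_cds) == len(variant_cds)
            and translate(original_cds) == translate(variant_cds))
```

For instance, synonymous_codons("AGC") returns the six serine codons named in the text, while synonymous_codons("ATG") and synonymous_codons("TGG") each return a single codon, which is why methionine and tryptophan are the exceptions.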

In addition to silent variations, other conservative variations that alter one or a few
amino acids in the encoded polypeptide can be made without altering the function of the
polypeptide; these conservative variants are, likewise, a feature of the invention.

For example, substitutions, deletions and insertions introduced into the sequences
provided in the Sequence Listing are also envisioned by the invention. Such sequence
modifications can be engineered into a sequence by site-directed mutagenesis (Wu (ed.) Meth.
Enzymol. (1993) vol. 217, Academic Press) or the other methods noted below. Amino acid
substitutions are typically of single residues; insertions usually will be on the order of from about
1 to 10 amino acid residues; and deletions will range from about 1 to 30 residues. In preferred
embodiments, deletions or insertions are made in adjacent pairs, e.g., a deletion of two residues or
insertion of two residues. Substitutions, deletions, insertions or any combination thereof can be
combined to arrive at a sequence. The mutations that are made in the polynucleotide encoding the
transcription factor should not place the sequence out of reading frame and should not create
complementary regions that could produce secondary mRNA structure. Preferably, the
polypeptide encoded by the DNA performs the desired function.
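
The reading-frame requirement can be expressed as a simple length check. The Python sketch below is illustrative only and rests on a simplifying assumption: it compares only the net change in length between an original and a mutated coding sequence, so it does not detect compensating frameshifts inside the sequence, premature stop codons, or secondary mRNA structure. The function name and the example sequences are hypothetical.

```python
def net_change_preserves_frame(original_cds: str, mutated_cds: str) -> bool:
    """True if the net insertion or deletion is a whole number of codons."""
    return (len(mutated_cds) - len(original_cds)) % 3 == 0


original = "ATGGCTAGCAAATAA"          # hypothetical 5-codon coding sequence
in_frame_deletion = "ATGGCTAAATAA"    # one whole codon (AGC) removed
frameshift_deletion = "ATGGCTGCAAATAA"  # a single nucleotide removed
```

Here the deletion of a whole codon leaves the downstream codons in frame, whereas the single-nucleotide deletion shifts every downstream codon.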

Conservative substitutions are those in which at least one residue in the amino acid 
sequence has been removed and a different residue inserted in its place. Such substitutions 
generally are made in accordance with Table 2 when it is desired to maintain the activity of
the protein. Table 2 shows amino acids which can be substituted for an amino acid in a protein 
and which are typically regarded as conservative substitutions. 

Table 2

Residue         Conservative Substitutions
Ala             Ser
Arg             Lys
Asn             Gln; His
Asp             Glu
Gln             Asn
Cys             Ser
Glu             Asp
Gly             Pro
His             Asn; Gln
Ile             Leu; Val
Leu             Ile; Val
Lys             Arg; Gln
Met             Leu; Ile
Phe             Met; Leu; Tyr
Ser             Thr; Gly
Thr             Ser; Val
Trp             Tyr
Tyr             Trp; Phe
Val             Ile; Leu

Substitutions that are less conservative than those in Table 2 can be selected by picking
residues that differ more significantly in their effect on maintaining (a) the structure of the
polypeptide backbone in the area of the substitution, for example, as a sheet or helical
conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of
the side chain. The substitutions which in general are expected to produce the greatest changes in
protein properties will be those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl;
(b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an
electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g.,
phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.
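
The lookup described by Table 2 can be sketched as a dictionary. The Python example below is illustrative only. It transcribes Table 2 using three-letter residue codes and treats the table as directional, exactly as printed (for example, Gly is listed as a substitution for Ser, but Ser is not listed for Gly). The names CONSERVATIVE_SUBSTITUTIONS and is_conservative are hypothetical.

```python
# Table 2, keyed by the original residue; values are the listed substitutions.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if Table 2 lists `replacement` as a substitution for `original`."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())
```

A residue that has no row in Table 2 (Pro, in this transcription) simply yields False for every replacement, which is consistent with the statement above that substituting proline is expected to produce larger changes.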

FURTHER MODIFYING SEQUENCES OF THE INVENTION— MUTATION/ 
FORCED EVOLUTION

In addition to generating silent or conservative substitutions as noted above, the present
invention optionally includes methods of modifying the sequences of the Sequence Listing. In
the methods, nucleic acid or protein modification methods are used to alter the given sequences to 
produce new sequences and/or to chemically or enzymatically modify given sequences to change 
the properties of the nucleic acids or proteins. 

Thus, in one embodiment, given nucleic acid sequences are modified, e.g., according to
standard mutagenesis or artificial evolution methods to produce modified sequences. For
example, Ausubel, supra, provides additional details on mutagenesis methods. Artificial forced
evolution methods are described, e.g., by Stemmer (1994) Nature 370:389-391, and Stemmer 
(1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. Many other mutation and evolution methods 
are also available and expected to be within the skill of the practitioner. 

Similarly, chemical or enzymatic alteration of expressed nucleic acids and polypeptides
can be performed by standard methods. For example, a sequence can be modified by addition of
lipids, sugars, peptides, organic or inorganic compounds, by the inclusion of modified nucleotides 
or amino acids, or the like. For example, protein modification techniques are illustrated in 
Ausubel, supra. Further details on chemical and enzymatic modifications can be found herein. 

These modification methods can be used to modify any given sequence, or to modify any
sequence produced by the various mutation and artificial evolution modification methods noted
herein. 

Accordingly, the invention provides for modification of any given nucleic acid by 
mutation, evolution, chemical or enzymatic modification, or other available methods, as well as 
for the products produced by practicing such methods, e.g., using the sequences herein as a
starting substrate for the various modification approaches. 

For example, an optimized coding sequence containing codons preferred by a particular
prokaryotic or eukaryotic host can be used, e.g., to increase the rate of translation or to produce
recombinant RNA transcripts having desirable properties, such as a longer half-life, as compared
with transcripts produced using a non-optimized sequence. Translation stop codons can also be 
modified to reflect host preference. For example, preferred stop codons for S. cerevisiae and 
mammals are TAA and TGA, respectively. The preferred stop codon for monocotyledonous 
plants is TGA, whereas insects and E. coli prefer to use TAA as the stop codon. 
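
The stop codon preferences listed above can be applied with a small helper. The Python sketch below is illustrative only. The preference table simply restates the hosts and codons named in the preceding paragraph, and the function name with_preferred_stop and the example coding sequence are hypothetical. The sketch assumes a DNA coding sequence that is in frame and already ends with a stop codon, and it changes nothing but that final codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Preferred stop codons as stated in the text above.
PREFERRED_STOP = {
    "S. cerevisiae": "TAA",
    "mammals": "TGA",
    "monocotyledonous plants": "TGA",
    "insects": "TAA",
    "E. coli": "TAA",
}


def with_preferred_stop(cds: str, host: str) -> str:
    """Return `cds` with its terminal stop codon replaced by the host's preferred one."""
    cds = cds.upper()
    if len(cds) < 3 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence does not end with a stop codon")
    if host not in PREFERRED_STOP:
        raise KeyError(f"no stop codon preference recorded for host: {host}")
    return cds[:-3] + PREFERRED_STOP[host]
```

Because all three stop codons terminate translation, this change leaves the encoded polypeptide unchanged. Codon preferences for the rest of the coding sequence would need a host-specific usage table, which is not given here.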

The polynucleotide sequences of the present invention can also be engineered in order to
alter a coding sequence for a variety of reasons, including but not limited to, alterations which
modify the sequence to facilitate cloning, processing and/or expression of the gene product. For
example, alterations are optionally introduced using techniques which are well known in the art,
e.g., site-directed mutagenesis, to insert new restriction sites, to alter glycosylation patterns, to
change codon preference, to introduce splice sites, etc.

Furthermore, a fragment or domain derived from any of the polypeptides of the invention 
can be combined with domains derived from other transcription factors or synthetic domains to 
modify the biological activity of a transcription factor. For instance, a DNA binding domain 
derived from a transcription factor of the invention can be combined with the activation domain
of another transcription factor or with a synthetic activation domain. A transcription activation
domain assists in initiating transcription from a DNA binding site. Examples include the
transcription activation region of VP16 or GAL4 (Moore et al. (1998) Proc. Natl. Acad. Sci. USA
95: 376-381; and Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from bacterial
sequences (Ma and Ptashne (1987) Cell 51:113-119) and synthetic peptides (Giniger and
Ptashne (1987) Nature 330:670-672).

EXPRESSION AND MODIFICATION OF POLYPEPTIDES 

Typically, polynucleotide sequences of the invention are incorporated into recombinant
DNA (or RNA) molecules that direct expression of polypeptides of the invention in appropriate
host cells, transgenic plants, in vitro translation systems, or the like. Due to the inherent
degeneracy of the genetic code, nucleic acid sequences which encode substantially the same or a
functionally equivalent amino acid sequence can be substituted for any listed sequence to provide
for cloning and expressing the relevant homologue.

Vectors, Promoters and Expression Systems

The present invention includes recombinant constructs comprising one or more of the
nucleic acid sequences herein. The constructs typically comprise a vector, such as a plasmid, a
cosmid, a phage, a virus (e.g., a plant virus), a bacterial artificial chromosome (BAC), a yeast
artificial chromosome (YAC), or the like, into which a nucleic acid sequence of the invention has
been inserted, in a forward or reverse orientation. In a preferred aspect of this embodiment, the
construct further comprises regulatory sequences, including, for example, a promoter, operably
linked to the sequence. Large numbers of suitable vectors and promoters are known to those of
skill in the art, and are commercially available.

General texts which describe molecular biological techniques useful herein, including the
use and production of vectors, promoters and many other relevant topics, include Berger,
Sambrook and Ausubel, supra. Any of the identified sequences can be incorporated into a cassette
or vector, e.g., for expression in plants. A number of expression vectors suitable for stable
transformation of plant cells or for the establishment of transgenic plants have been described,
including those described in Weissbach and Weissbach (1989) Methods for Plant Molecular
Biology, Academic Press, and Gelvin et al. (1990) Plant Molecular Biology Manual, Kluwer
Academic Publishers. Specific examples include those derived from a Ti plasmid of
Agrobacterium tumefaciens, as well as those disclosed by Herrera-Estrella et al. (1983) Nature
303: 209, Bevan (1984) Nucl Acid Res. 12: 8711-8721, and Klee (1985) Bio/Technology 3: 637-642,
for dicotyledonous plants.

Alternatively, non-Ti vectors can be used to transfer the DNA into monocotyledonous 
plants and cells by using free DNA delivery techniques. Such methods can involve, for example, 
the use of liposomes, electroporation, microprojectile bombardment, silicon carbide whiskers, and 
viruses. By using these methods transgenic plants such as wheat, rice (Christou (1991)
Bio/Technology 9: 957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603-618) can be
produced. An immature embryo can also be a good target tissue for monocots for direct DNA
delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol 102: 1077-1084;
Vasil (1993) Bio/Technology 10: 667-674; Wan and Lemaux (1994) Plant Physiol 104: 37-48),
and for Agrobacterium-mediated DNA transfer (Ishida et al. (1996) Nature Biotech 14: 745-750).

Typically, plant transformation vectors include one or more cloned plant coding sequences
(genomic or cDNA) under the transcriptional control of 5' and 3' regulatory sequences and a
dominant selectable marker. Such plant transformation vectors typically also contain a promoter
(e.g., a regulatory region controlling inducible or constitutive, environmentally- or
developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start
site, an RNA processing signal (such as intron splice sites), a transcription termination site, and/or
a polyadenylation signal. 

Examples of constitutive plant promoters which can be useful for expressing the TF 
sequence include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers 
constitutive, high-level expression in most plant tissues (see, e.g., Odell et al. (1985) Nature
313:810); the nopaline synthase promoter (An et al. (1988) Plant Physiol 88:547); and the 
octopine synthase promoter (Fromm et al. (1989) Plant Cell 1: 977). 

A variety of plant gene promoters that regulate gene expression in response to
environmental, hormonal, chemical, or developmental signals, or in a tissue-active manner, can be
used for expression of a TF sequence in plants. Choice of a promoter is based largely on the
phenotype of interest and is determined by such factors as tissue (e.g., seed, fruit, root, pollen,
vascular tissue, flower, carpel, etc.), inducibility (e.g., in response to wounding, heat, cold,
drought, light, pathogens, etc.), timing, developmental stage, and the like. Numerous known
promoters have been characterized and can favorably be employed to promote expression of a
polynucleotide of the invention in a transgenic plant or cell of interest. For example, tissue
specific promoters include: seed-specific promoters (such as the napin, phaseolin or DC3
promoter described in US Pat. No. 5,773,697), fruit-specific promoters that are active during fruit
ripening (such as the dru1 promoter (US Pat. No. 5,783,393), the 2A11 promoter (US Pat. No.
4,943,674) and the tomato polygalacturonase promoter (Bird et al. (1988) Plant Mol Biol 11:651)),
root-specific promoters, such as those disclosed in US Patent Nos. 5,618,988, 5,837,848 and
5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (US Pat No. 5,792,929),
promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol Biol 37:977-988),
flower-specific (Kaiser et al. (1995) Plant Mol Biol 28:231-243), pollen (Baerson et al. (1994) Plant Mol
Biol 26:1947-1959), carpels (Ohl et al. (1990) Plant Cell 2:837-848), pollen and ovules (Baerson
et al. (1993) Plant Mol Biol 22:255-267), auxin-inducible promoters (such as that described in
van der Kop et al. (1999) Plant Mol Biol 39:979-990 or Baumann et al. (1999) Plant Cell
11:323-334), cytokinin-inducible promoter (Guevara-Garcia (1998) Plant Mol Biol 38:743-753),
promoters responsive to gibberellin (Shi et al. (1998) Plant Mol Biol 38:1053-1060, Willmott et
al. (1998) 38:817-825) and the like. Additional promoters are those that elicit expression in
response to heat (Ainley et al. (1993) Plant Mol Biol 22: 13-23), light (e.g., the pea rbcS-3A
promoter, Kuhlemeier et al. (1989) Plant Cell 1:471, and the maize rbcS promoter, Schaffner and
Sheen (1991) Plant Cell 3: 997); wounding (e.g., wun1, Siebertz et al. (1989) Plant Cell 1: 961);
pathogens (such as the PR-1 promoter described in Buchel et al. (1999) Plant Mol. Biol.
40:387-396, and the PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol. 38:1071-80),
and chemicals such as methyl jasmonate or salicylic acid (Gatz et al. (1997) Plant Mol Biol
48: 89-108). In addition, the timing of the expression can be controlled by using promoters such as those
acting at senescence (Gan and Amasino (1995) Science 270: 1986-1988); or late seed development
(Odell et al. (1994) Plant Physiol 106:447-458).

Plant expression vectors can also include RNA processing signals that can be positioned
within, upstream or downstream of the coding sequence. In addition, the expression vectors can
include additional regulatory sequences from the 3'-untranslated region of plant genes, e.g., a 3'
terminator region to increase stability of the mRNA, such as the PI-II terminator region of
potato or the octopine or nopaline synthase 3' terminator regions.

Additional Expression Elements

Specific initiation signals can aid in efficient translation of coding sequences. These
signals can include, e.g., the ATG initiation codon and adjacent sequences. In cases where a
coding sequence, its initiation codon and upstream sequences are inserted into the appropriate
expression vector, no additional translational control signals may be needed. However, in cases
where only coding sequence (e.g., a mature protein coding sequence), or a portion thereof, is
inserted, exogenous translational control signals including the ATG initiation codon can be
separately provided. The initiation codon is provided in the correct reading frame to facilitate
translation. Exogenous translational elements and initiation codons can be of various origins,
both natural and synthetic. The efficiency of expression can be enhanced by the inclusion of
enhancers appropriate to the cell system in use.

Expression Hosts 

The present invention also relates to host cells which are transduced with vectors of the 
invention, and the production of polypeptides of the invention (including fragments thereof) by 
recombinant techniques. Host cells are genetically engineered (i.e., nucleic acids are introduced,
e.g., transduced, transformed or transfected) with the vectors of this invention, which may be, for
example, a cloning vector or an expression vector comprising the relevant nucleic acids herein.
The vector is optionally a plasmid, a viral particle, a phage, a naked nucleic acid, etc. The
engineered host cells can be cultured in conventional nutrient media modified as appropriate for
activating promoters, selecting transformants, or amplifying the relevant gene. The culture
conditions, such as temperature, pH and the like, are those previously used with the host cell
selected for expression, and will be apparent to those skilled in the art and in the references cited
herein, including Sambrook and Ausubel.

The host cell can be a eukaryotic cell, such as a yeast cell, or a plant cell, or the host cell 
can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are also suitable for some 
applications. For example, the DNA fragments are introduced into plant tissues, cultured plant
cells or plant protoplasts by standard methods including electroporation (Fromm et al. (1985)
Proc. Natl. Acad. Sci. USA 82, 5824), infection by viral vectors such as cauliflower mosaic virus
(CaMV) (Hohn et al. (1982) Molecular Biology of Plant Tumors (Academic Press, New York)
pp. 549-560; US 4,407,956), high velocity ballistic penetration by small particles with the nucleic
acid either within the matrix of small beads or particles, or on the surface (Klein et al. (1987)
Nature 327, 70-73), use of pollen as vector (WO 85/01856), or use of Agrobacterium tumefaciens
or A. rhizogenes carrying a T-DNA plasmid in which DNA fragments are cloned. The T-DNA
plasmid is transmitted to plant cells upon infection by Agrobacterium tumefaciens, and a portion
is stably integrated into the plant genome (Horsch et al. (1984) Science 233:496-498; Fraley et al.
(1983) Proc. Natl. Acad. Sci. USA 80, 4803).

The cell can include a nucleic acid of the invention which encodes a polypeptide, wherein
the cell expresses a polypeptide of the invention. The cell can also include vector sequences, or
the like. Furthermore, cells and transgenic plants which include any polypeptide or nucleic acid
above or throughout this specification, e.g., produced by transduction of a vector of the invention,
are an additional feature of the invention.

For long-term, high-yield production of recombinant proteins, stable expression can be 
used. Host cells transformed with a nucleotide sequence encoding a polypeptide of the invention 
are optionally cultured under conditions suitable for the expression and recovery of the encoded
protein from cell culture. The protein or fragment thereof produced by a recombinant cell may be
secreted, membrane-bound, or contained intracellularly, depending on the sequence and/or the 
vector used. As will be understood by those of skill in the art, expression vectors containing 
polynucleotides encoding mature proteins of the invention can be designed with signal sequences 
which direct secretion of the mature polypeptides through a prokaryotic or eukaryotic cell
membrane.

Modified Amino Acids

Polypeptides of the invention may contain one or more modified amino acids. The
presence of modified amino acids may be advantageous in, for example, increasing polypeptide
half-life, reducing polypeptide antigenicity or toxicity, increasing polypeptide storage stability, or
the like. Amino acid(s) are modified, for example, co-translationally or post-translationally
during recombinant production or modified by synthetic or chemical means.

Non-limiting examples of a modified amino acid include incorporation or other use of
acetylated amino acids, glycosylated amino acids, sulfated amino acids, prenylated (e.g.,
farnesylated, geranylgeranylated) amino acids, PEG modified (e.g., "PEGylated") amino acids,
biotinylated amino acids, carboxylated amino acids, phosphorylated amino acids, etc. References
adequate to guide one of skill in the modification of amino acids are replete throughout the 
literature. 

IDENTIFICATION OF ADDITIONAL FACTORS 

A transcription factor provided by the present invention can also be used to identify
additional endogenous or exogenous molecules that can affect a phenotype or trait of interest. On
the one hand, such molecules include organic (small or large molecules) and/or inorganic
compounds that affect expression of (i.e., regulate) a particular transcription factor.
Alternatively, such molecules include endogenous molecules that are acted upon at a
transcriptional level by a transcription factor of the invention to modify a phenotype as desired.
For example, the transcription factors can be employed to identify one or more downstream genes
that are subject to a regulatory effect of the transcription factor. In one approach, a
transcription factor or transcription factor homologue of the invention is expressed in a host cell,
e.g., a transgenic plant cell, tissue or explant, and expression products, either RNA or protein, of
likely or random targets are monitored, e.g., by hybridization to a microarray of nucleic acid
probes corresponding to genes expressed in a tissue or cell type of interest, by two-dimensional
gel electrophoresis of protein products, or by any other method known in the art for assessing
expression of gene products at the level of RNA or protein. Alternatively, a transcription factor
of the invention can be used to identify promoter sequences (i.e., binding sites) involved in the
regulation of a downstream target. After identifying a promoter sequence, interactions between
the transcription factor and the promoter sequence can be modified by changing specific
nucleotides in the promoter sequence or specific amino acids in the transcription factor that
interact with the promoter sequence to alter a plant trait. Typically, transcription factor DNA
binding sites are identified by gel shift assays. After identifying the promoter regions, the
promoter region sequences can be employed in double-stranded DNA arrays to identify
molecules that affect the interactions of the transcription factors with their promoters (Bulyk et al.
(1999) Nature Biotechnology 17:573-577).

The identified transcription factors are also useful to identify proteins that modify the 
activity of the transcription factor. Such modification can occur by covalent modification, such
as by phosphorylation, or by protein-protein (homo- or heteropolymer) interactions. Any method
suitable for detecting protein-protein interactions can be employed. Among the methods that can 
be employed are co-immunoprecipitation, cross-linking and co-purification through gradients or 
chromatographic columns, and the two-hybrid yeast system. 



WO 01/36597  PCT/US00/31344



The two-hybrid system detects protein interactions in vivo and is described in Chien, et 
al., (1991), Proc. Natl. Acad. Sci. USA 88, 9578-9582 and is commercially available from 
Clontech (Palo Alto, Calif.). In such a system, plasmids are constructed that encode two hybrid 
proteins: one consists of the DNA-binding domain of a transcription activator protein fused to the 
TF polypeptide and the other consists of the transcription activator protein's activation domain
fused to an unknown protein that is encoded by a cDNA that has been recombined into the 
plasmid as part of a cDNA library. The DNA-binding domain fusion plasmid and the cDNA 
library are transformed into a strain of the yeast Saccharomyces cerevisiae that contains a reporter 
gene (e.g., lacZ) whose regulatory region contains the transcription activator's binding site. Neither
hybrid protein alone can activate transcription of the reporter gene. Interaction of the two
hybrid proteins reconstitutes the functional activator protein and results in expression of the 
reporter gene, which is detected by an assay for the reporter gene product. Then, the library 
plasmids responsible for reporter gene expression are isolated and sequenced to identify the 
proteins encoded by the library plasmids. After identifying proteins that interact with the
transcription factors, assays for compounds that interfere with the TF protein-protein interactions
can be performed.

IDENTIFICATION OF MODULATORS 

In addition to the intracellular molecules described above, extracellular molecules that
alter activity or expression of a transcription factor, either directly or indirectly, can be identified.
For example, the methods can entail first placing a candidate molecule in contact with a plant or
plant cell. The molecule can be introduced by topical administration, such as spraying or soaking
of a plant, and the molecule's effect on the expression or activity of the TF polypeptide or
the expression of the polynucleotide is then monitored. Changes in the expression of the TF polypeptide
can be monitored by use of polyclonal or monoclonal antibodies, gel electrophoresis or the like.
Changes in the expression of the corresponding polynucleotide sequence can be detected by use
of microarrays, Northerns, quantitative PCR, or any other technique for monitoring changes in
mRNA expression. These techniques are exemplified in Ausubel et al. (eds) Current Protocols in
Molecular Biology, John Wiley & Sons (1998). Such changes in the expression levels can be
correlated with modified plant traits, and thus identified molecules can be useful for soaking or
spraying on fruit, vegetable and grain crops to modify traits in plants.

Essentially any available composition can be tested for modulatory activity of expression 
or activity of any nucleic acid or polypeptide herein. Thus, available libraries of compounds such 
as chemicals, polypeptides, nucleic acids and the like can be tested for modulatory activity. 






Often, potential modulator compounds can be dissolved in aqueous or organic (e.g., DMSO-based)
solutions for easy delivery to the cell or plant of interest in which the activity of the
modulator is to be tested. Optionally, the assays are designed to screen large modulator
composition libraries by automating the assay steps and providing compounds from any
convenient source to assays, which are typically run in parallel (e.g., in microtiter formats on
microtiter plates in robotic assays).

In one embodiment, high throughput screening methods involve providing a
combinatorial library containing a large number of potential compounds (potential modulator
compounds). Such "combinatorial chemical libraries" are then screened in one or more assays, as
described herein, to identify those library members (particular chemical species or subclasses)
that display a desired characteristic activity. The compounds thus identified can serve as target
compounds.

A combinatorial chemical library can be, e.g., a collection of diverse chemical
compounds generated by chemical synthesis or biological synthesis. For example, a
combinatorial chemical library such as a polypeptide library is formed by combining a set of
chemical building blocks (e.g., in one example, amino acids) in every possible way for a given
compound length (i.e., the number of amino acids in a polypeptide compound of a set length).
Exemplary libraries include peptide libraries, nucleic acid libraries, antibody libraries (see, e.g.,
Vaughn et al. (1996) Nature Biotechnology 14(3):309-314 and PCT/US96/10287), carbohydrate
libraries (see, e.g., Liang et al. Science (1996) 274:1520-1522 and U.S. Patent 5,593,853),
peptide nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), and small organic molecule
libraries (see, e.g., benzodiazepines, Baum C&EN Jan 18, page 33 (1993); isoprenoids, U.S.
Patent 5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, U.S.
Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 5,506,337) and the like.
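
By way of illustration only, the enumeration of a polypeptide library "in every possible way for a given compound length" can be sketched in Python as follows. The function names are hypothetical and are not part of the disclosure; the sketch merely shows that a library of peptides of length n built from 20 amino acids contains 20^n members.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes (the building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def peptide_library(length, building_blocks=AMINO_ACIDS):
    """Yield every peptide of the given length over the building blocks."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)


def library_size(length, n_blocks=len(AMINO_ACIDS)):
    """Number of members in the complete library: n_blocks ** length."""
    return n_blocks ** length


if __name__ == "__main__":
    dipeptides = list(peptide_library(2))
    print(len(dipeptides), dipeptides[:3])
    # The library grows exponentially with compound length.
    print(library_size(5))
```

Because the library size grows exponentially with compound length, such a complete enumeration is practical only for short compounds, which is one reason the assays are typically automated and run in parallel.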

Preparation and screening of combinatorial or other libraries are well known to those of
skill in the art. Such combinatorial chemical libraries include, but are not limited to, peptide
libraries (see, e.g., U.S. Patent 5,010,175, Furka, Int. J. Pept. Prot. Res. 37:487-493 (1991) and
Houghton et al. Nature 354:84-88 (1991)). Other chemistries for generating chemical diversity
libraries can also be used.

In addition, as noted, compound screening equipment for high-throughput screening is
generally available, e.g., using any of a number of well known robotic systems that have also
been developed for solution phase chemistries useful in assay systems. These systems include
automated workstations including an automated synthesis apparatus and robotic systems utilizing
robotic arms. Any of the above devices are suitable for use with the present invention, e.g., for
high-throughput screening of potential modulators. The nature and implementation of
modifications to these devices (if any) so that they can operate as discussed herein will be
apparent to persons skilled in the relevant art.

Indeed, entire high throughput screening systems are commercially available. These
systems typically automate entire procedures including all sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate for
the assay. These configurable systems provide high throughput and rapid start up as well as a
high degree of flexibility and customization. Similarly, microfluidic implementations of
screening are also commercially available.

The manufacturers of such systems provide detailed protocols for the various high
throughput screening methods. Thus, for example, Zymark Corp. provides technical bulletins describing screening
systems for detecting the modulation of gene transcription, ligand binding, and the like. The
integrated systems herein, in addition to providing for sequence alignment and, optionally,
synthesis of relevant nucleic acids, can include such screening apparatus to identify modulators
that have an effect on one or more polynucleotides or polypeptides according to the present
invention.

In some assays it is desirable to have positive controls to ensure that the components of
the assays are working properly. At least two types of positive controls are appropriate. That is,
known transcriptional activators or inhibitors can be incubated with cells, plants, etc. in one
sample of the assay, and the resulting increase or decrease in transcription can be detected by
measuring the resulting change in RNA or protein expression, etc., according to the methods
herein. It will be appreciated that modulators can also be combined with transcriptional
activators or inhibitors to find modulators which inhibit transcriptional activation or
transcriptional repression. Either expression of the nucleic acids and proteins herein or any
additional nucleic acids or proteins activated by the nucleic acids or proteins herein, or both, can
be monitored.

In an embodiment, the invention provides a method for identifying compositions that
modulate the activity or expression of a polynucleotide or polypeptide of the invention. For
example, a test compound, whether a small or large molecule, is placed in contact with a cell,
plant (or plant tissue or explant), or composition comprising the polynucleotide or polypeptide of
interest and a resulting effect on the cell, plant (or tissue or explant) or composition is evaluated
by monitoring, either directly or indirectly, one or more of: expression level of the polynucleotide
or polypeptide, activity (or modulation of the activity) of the polynucleotide or polypeptide. In
some cases, an alteration in a plant phenotype can be detected following contact of a plant (or
plant cell, or tissue or explant) with the putative modulator, e.g., by modulation of expression or
activity of a polynucleotide or polypeptide of the invention.

SUBSEQUENCES 

Also contemplated are uses of polynucleotides, also referred to herein as
oligonucleotides, typically having at least 12 bases, preferably at least 15, more preferably at least
20, 30, or 50 bases, which hybridize under at least highly stringent (or ultra-high stringent or
ultra-ultra-high stringent) conditions to a polynucleotide sequence described above.
The polynucleotides may be used as probes, primers, sense and antisense agents, and the like,
according to methods as noted supra.

Subsequences of the polynucleotides of the invention, including polynucleotide
fragments and oligonucleotides, are useful as nucleic acid probes and primers. An oligonucleotide
suitable for use as a probe or primer is at least about 15 nucleotides in length, more often at least
about 18 nucleotides, often at least about 21 nucleotides, frequently at least about 30 nucleotides,
or about 40 nucleotides, or more in length. A nucleic acid probe is useful in hybridization
protocols, e.g., to identify additional polypeptide homologues of the invention, including
protocols for microarray experiments. Primers can be annealed to a complementary target DNA
strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA
strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer
pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain
reaction (PCR) or other nucleic-acid amplification methods. See Sambrook and Ausubel, supra.

In addition, the invention includes an isolated or recombinant polypeptide including a
subsequence of at least about 15 contiguous amino acids encoded by the recombinant or isolated
polynucleotides of the invention. For example, such polypeptides, or domains or fragments
thereof, can be used as immunogens, e.g., to produce antibodies specific for the polypeptide
sequence, or as probes for detecting a sequence of interest. A subsequence can range in size from
about 15 amino acids in length up to and including the full length of the polypeptide.

PRODUCTION OF TRANSGENIC PLANTS 
Modification of Traits 

The polynucleotides of the invention are favorably employed to produce transgenic plants
with various traits, or characteristics, that have been modified in a desirable manner, e.g., to
improve the seed characteristics of a plant. For example, alteration of expression levels or
patterns (e.g., spatial or temporal expression patterns) of one or more of the transcription factors
(or transcription factor homologues) of the invention, as compared with the levels of the same
protein found in a wild type plant, can be used to modify a plant's traits. An illustrative example
of trait modification, improved biochemical characteristics, by altering expression levels of a
particular transcription factor is described further in the Examples and the Sequence Listing.

Antisense and Cosuppression Approaches

In addition to expression of the nucleic acids of the invention as gene replacement or
plant phenotype modification nucleic acids, the nucleic acids are also useful for sense and
anti-sense suppression of expression, e.g., to down-regulate expression of a nucleic acid of the
invention, e.g., as a further mechanism for modulating plant phenotype. That is, the nucleic acids
of the invention, or subsequences or anti-sense sequences thereof, can be used to block expression
of naturally occurring homologous nucleic acids. A variety of sense and anti-sense technologies
are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997) Antisense Technology: A
Practical Approach, IRL Press at Oxford University, Oxford, England. In general, sense or
anti-sense sequences are introduced into a cell, where they are optionally amplified, e.g., by
transcription. Such sequences include both simple oligonucleotide sequences and catalytic
sequences such as ribozymes.

For example, a reduction or elimination of expression (i.e., a "knock-out") of a
transcription factor or transcription factor homologue polypeptide in a transgenic plant, e.g., to
modify a plant trait, can be obtained by introducing an antisense construct corresponding to the
polypeptide of interest as a cDNA. For antisense suppression, the transcription factor or homologue
cDNA is arranged in reverse orientation (with respect to the coding sequence) relative to the
promoter sequence in the expression vector. The introduced sequence need not be the full length
cDNA or gene, and need not be identical to the cDNA or gene found in the plant type to be
transformed. Typically, the antisense sequence need only be capable of hybridizing to the target
gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a higher
degree of homology to the endogenous transcription factor sequence will be needed for effective
antisense suppression. While antisense sequences of various lengths can be utilized, preferably,
the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and
improved antisense suppression will typically be observed as the length of the antisense sequence
increases. Preferably, the length of the antisense sequence in the vector will be greater than 100
nucleotides. Transcription of an antisense construct as described results in the production of
RNA molecules that are the reverse complement of mRNA molecules transcribed from the
endogenous transcription factor gene in the plant cell.

Suppression of endogenous transcription factor gene expression can also be achieved
using a ribozyme. Ribozymes are RNA molecules that possess highly specific endoribonuclease
activity. The production and use of ribozymes are disclosed in U.S. Patent No. 4,987,071 and
U.S. Patent No. 5,543,508. Synthetic ribozyme sequences including antisense RNAs can be used
to confer RNA cleaving activity on the antisense RNA, such that endogenous mRNA molecules
that hybridize to the antisense RNA are cleaved, which in turn leads to an enhanced antisense
inhibition of endogenous gene expression.
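
By way of illustration only, the reverse-complement relationship between an antisense transcript and the endogenous mRNA, on which the hybridization described above depends, can be sketched in Python as follows. The function names and the example sequence are hypothetical; the sketch assumes an unambiguous DNA coding strand written 5' to 3'.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (5'->3')."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def sense_mrna(coding_strand):
    """mRNA transcribed from the gene: same sequence as the coding strand."""
    return coding_strand.upper().replace("T", "U")


def antisense_transcript(coding_strand):
    """RNA produced when the cDNA is cloned in reverse orientation
    relative to the promoter: the reverse complement of the mRNA."""
    return reverse_complement(coding_strand).replace("T", "U")


if __name__ == "__main__":
    cds_fragment = "ATGGCC"  # hypothetical fragment, for illustration
    print(sense_mrna(cds_fragment))
    print(antisense_transcript(cds_fragment))
```

Because the antisense transcript is the reverse complement of the mRNA, the two molecules can base-pair along their shared length, which is why longer or more similar introduced sequences are expected to hybridize more effectively.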

Vectors in which RNA encoded by a transcription factor or transcription factor
homologue cDNA is over-expressed can also be used to obtain co-suppression of a corresponding
endogenous gene, e.g., in the manner described in U.S. Patent No. 5,231,020 to Jorgensen. Such
co-suppression (also termed sense suppression) does not require that the entire transcription factor
cDNA be introduced into the plant cells, nor does it require that the introduced sequence be
exactly identical to the endogenous transcription factor gene of interest. However, as with
antisense suppression, the suppressive efficiency will be enhanced as specificity of hybridization
is increased, e.g., as the introduced sequence is lengthened, and/or as the sequence similarity
between the introduced sequence and the endogenous transcription factor gene is increased.

Vectors expressing an untranslatable form of the transcription factor mRNA (e.g.,
sequences comprising one or more stop codons or nonsense mutations) can also be used to
suppress expression of an endogenous transcription factor, thereby reducing or eliminating its
activity and modifying one or more traits. Methods for producing such constructs are described
in U.S. Patent No. 5,583,021. Preferably, such constructs are made by introducing a premature
stop codon into the transcription factor gene. Alternatively, a plant trait can be modified by gene
silencing using double-strand RNA (Sharp (1999) Genes and Development 13:139-141).
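
By way of illustration only, the effect of introducing a premature stop codon into a coding sequence can be sketched in Python as follows. The sketch assumes the standard genetic code; the function names and the short example sequence are hypothetical and do not correspond to any sequence of the invention.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def introduce_stop(cds, codon_index, stop="TAA"):
    """Replace the codon at codon_index (0-based) with a stop codon."""
    start = 3 * codon_index
    return cds[:start] + stop + cds[start + 3:]


if __name__ == "__main__":
    cds = "ATGGCTAAAGGTTGGTAA"        # hypothetical coding sequence
    print(translate(cds))                     # full-length product
    print(translate(introduce_stop(cds, 2)))  # truncated product
```

In this sketch the modified sequence encodes only the residues upstream of the introduced stop codon, so no full-length polypeptide would be produced from the construct.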

Another method for abolishing the expression of a gene is by insertion mutagenesis using
the T-DNA of Agrobacterium tumefaciens. After generating the insertion mutants, the mutants
can be screened to identify those containing the insertion in a transcription factor or transcription
factor homologue gene. Plants containing a single transgene insertion event at the desired gene
can be crossed to generate homozygous plants for the mutation (Koncz et al. (1992) Methods in
Arabidopsis Research, World Scientific).

Alternatively, a plant phenotype can be altered by eliminating an endogenous gene, such
as a transcription factor or transcription factor homologue, e.g., by homologous recombination
(Kempin et al. (1997) Nature 389:802).

A plant trait can also be modified by using the cre-lox system (for example, as described
in US Pat. No. 5,658,772). A plant genome can be modified to include first and second lox sites
that are then contacted with a Cre recombinase. If the lox sites are in the same orientation, the
intervening DNA sequence between the two sites is excised. If the lox sites are in the opposite
orientation, the intervening sequence is inverted.
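
By way of illustration only, the two outcomes of Cre-mediated recombination described above can be sketched in Python as follows. The representation is a hypothetical simplification: a locus is modeled as a list of DNA strings interspersed with lox markers written as ("lox", +1) or ("lox", -1), where the sign stands for the orientation of the site.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def cre_recombine(elements):
    """Apply Cre to a locus modeled as a list that mixes DNA strings
    and ("lox", +1 or -1) orientation markers.

    Same orientation      -> intervening DNA is excised.
    Opposite orientation  -> intervening DNA is inverted in place.
    """
    sites = [i for i, e in enumerate(elements) if isinstance(e, tuple)]
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two lox sites")
    first, second = sites
    if elements[first][1] == elements[second][1]:
        # Excision leaves a single lox site behind.
        return elements[:first + 1] + elements[second + 1:]
    # Inversion reverses segment order and flips each segment's strand.
    inverted = [reverse_complement(e)
                for e in reversed(elements[first + 1:second])]
    return elements[:first + 1] + inverted + elements[second:]


if __name__ == "__main__":
    same = ["AAA", ("lox", 1), "GATTC", ("lox", 1), "TTT"]
    opposite = ["AAA", ("lox", 1), "GATTC", ("lox", -1), "TTT"]
    print(cre_recombine(same))
    print(cre_recombine(opposite))
```

The sketch ignores the sequence of the lox sites themselves and treats orientation as a simple sign, which is sufficient to show why the relative orientation of the two sites determines whether the intervening sequence is removed or inverted.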

The polynucleotides and polypeptides of this invention can also be expressed in a plant in
the absence of an expression cassette by manipulating the activity or expression level of the
endogenous gene by other means. For example, a gene can be ectopically expressed by T-DNA
activation tagging (Ichikawa et al. (1997) Nature 390:698-701; Kakimoto et al. (1996) Science
274:982-985). This method entails transforming a plant with a gene tag containing multiple
transcriptional enhancers; once the tag has inserted into the genome, expression of a flanking
gene coding sequence becomes deregulated. In another example, the transcriptional machinery in
a plant can be modified so as to increase transcription levels of a polynucleotide of the invention
(see, e.g., PCT Publications WO 96/06166 and WO 98/53057, which describe the modification of
the DNA binding specificity of zinc finger proteins by changing particular amino acids in the
DNA binding motif).

The transgenic plant can also include the machinery necessary for expressing or altering
the activity of a polypeptide encoded by an endogenous gene, for example by altering the
phosphorylation state of the polypeptide to maintain it in an activated state.

Transgenic plants (or plant cells, or plant explants, or plant tissues) incorporating the
polynucleotides of the invention and/or expressing the polypeptides of the invention can be
produced by a variety of well established techniques as described above. Following construction
of a vector, most typically an expression cassette, including a polynucleotide, e.g., encoding a
transcription factor or transcription factor homologue, of the invention, standard techniques can
be used to introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue
of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce a transgenic
plant.

The plant can be any higher plant, including gymnosperms, monocotyledonous and
dicotyledonous plants. Suitable protocols are available for Leguminosae (alfalfa, soybean, clover,
etc.), Umbelliferae (carrot, celery, parsnip), Cruciferae (cabbage, radish, rapeseed, broccoli, etc.),
Cucurbitaceae (melons and cucumber), Gramineae (wheat, corn, rice, barley, millet, etc.),
Solanaceae (potato, tomato, tobacco, peppers, etc.), and various other crops. See protocols
described in Ammirato et al. (1984) Handbook of Plant Cell Culture - Crop Species, Macmillan
Publ. Co.; Shimamoto et al. (1989) Nature 338:274-276; Fromm et al. (1990) Bio/Technology
8:833-839; and Vasil et al. (1990) Bio/Technology 8:429-434.

Transformation and regeneration of both monocotyledonous and dicotyledonous plant
cells is now routine, and the selection of the most appropriate transformation technique will be
determined by the practitioner. The choice of method will vary with the type of plant to be
transformed; those skilled in the art will recognize the suitability of particular methods for given
plant types. Suitable methods can include, but are not limited to: electroporation of plant
protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated
transformation; transformation using viruses; micro-injection of plant cells; micro-projectile
bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens mediated
transformation. Transformation means introducing a nucleotide sequence in a plant in a manner to
cause stable or transient expression of the sequence.

Successful examples of the modification of plant characteristics by transformation with
cloned sequences which serve to illustrate the current knowledge in this field of technology, and
which are herein incorporated by reference, include: U.S. Patent Nos. 5,571,706; 5,677,175;
5,510,471; 5,750,386; 5,597,945; 5,589,615; 5,750,871; 5,268,526; 5,780,708; 5,538,880;
5,773,269; 5,736,369 and 5,610,042.

Following transformation, plants are preferably selected using a dominant selectable 
marker incorporated into the transformation vector. Typically, such a marker will confer 
antibiotic or herbicide resistance on the transformed plants, and selection of transformants can be 
accomplished by exposing the plants to appropriate concentrations of the antibiotic or herbicide. 

After transformed plants are selected and grown to maturity, those plants showing a
modified trait are identified. The modified trait can be any of those traits described above.
Additionally, to confirm that the modified trait is due to changes in expression levels or activity
of the polypeptide or polynucleotide of the invention, mRNA expression can be analyzed
using Northern blots, RT-PCR or microarrays, or protein expression can be analyzed using
immunoblots or Western blots or gel shift assays.

INTEGRATED SYSTEMS - SEQUENCE IDENTITY

Additionally, the present invention may be an integrated system, computer or computer
readable medium that comprises an instruction set for determining the identity of one or more
sequences in a database. In addition, the instruction set can be used to generate or identify
sequences that meet any specified criteria. Furthermore, the instruction set may be used to
associate or link certain functional benefits, such as improved biochemical characteristics, with one
or more identified sequences.

For example, the instruction set can include, e.g., a sequence comparison or other
alignment program, e.g., an available program such as, for example, the Wisconsin Package
Version 10.0, such as BLAST, FASTA, PILEUP, FINDPATTERNS or the like (GCG, Madison,
WI). Public sequence databases such as GenBank, EMBL, Swiss-Prot and PIR or private
sequence databases such as PhytoSeq (Incyte Pharmaceuticals, Palo Alto, CA) can be searched.

Alignment of sequences for comparison can be conducted by the local homology
algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the homology alignment
algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, by the search for similarity
method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. U.S.A. 85:2444, or by computerized
implementations of these algorithms. After alignment, sequence comparisons between two (or
more) polynucleotides or polypeptides are typically performed by comparing the
sequences over a comparison window to identify and compare local regions of sequence
similarity. The comparison window can be a segment of at least about 20 contiguous positions,
usually about 50 to about 200, more usually about 100 to about 150 contiguous positions. A
description of the method is provided in Ausubel et al., supra.
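
By way of illustration only, a local alignment score in the manner of the Smith and Waterman local homology algorithm can be sketched in Python as follows. The scoring values (match, mismatch and a linear gap penalty) are assumptions chosen for the example and are not taken from the disclosure or from any particular program; only the score of the best local alignment is returned, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Dynamic programming with a floor of zero, so that an alignment may
    start and end anywhere in either sequence (local alignment).
    Scoring values are illustrative assumptions; gaps are linear.
    """
    cols = len(b) + 1
    previous = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * cols
        for j in range(1, cols):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # Two hypothetical sequences sharing the local region "ACGT".
    print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

Only two rows of the scoring matrix are kept in memory at a time, which suffices when the score alone is required; recovering the aligned region would require retaining the full matrix or traceback pointers.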

A variety of methods of determining sequence relationships can be used, including
manual alignment and computer assisted sequence alignment and analysis. This latter approach is
a preferred approach in the present invention, due to the increased throughput afforded by
computer assisted methods. As noted above, a variety of computer programs for performing
sequence alignment are available, or can be produced by one of skill.

One example algorithm that is suitable for determining percent sequence identity and
sequence similarity is the BLAST algorithm, which is described in Altschul et al. J. Mol. Biol.
215:403-410 (1990). Software for performing BLAST analyses is publicly available, e.g.,
through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them.
The word hits are then extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, for
nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >
0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a
scoring matrix is used to calculate the cumulative score. Extension of the word hits in each
direction is halted when: the cumulative alignment score falls off by the quantity X from its
maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of
one or more negative-scoring residue alignments; or the end of either sequence is reached. The
BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an
expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino
acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E)
of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. Acad.
Sci. USA 89:10915).
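
By way of illustration only, the seeding and extension steps described above can be sketched in Python for nucleotide sequences as follows. This is a simplified sketch and not the BLAST implementation: seeds are exact word matches only (no neighborhood threshold T), extension is ungapped, the drop-off value X=20 is an assumed value, and of the halting conditions only the X drop-off and the end of either sequence are applied. The function names are hypothetical.

```python
def find_seeds(query, subject, W=11):
    """Yield (query_pos, subject_pos) for every exact W-letter word match."""
    index = {}
    for s in range(len(subject) - W + 1):
        index.setdefault(subject[s:s + W], []).append(s)
    for q in range(len(query) - W + 1):
        for s in index.get(query[q:q + W], []):
            yield q, s


def extend(query, subject, q, s, step, M, N, X):
    """Ungapped extension from (q, s) in direction step (+1 or -1).

    Returns (best_gain, best_length): the largest score increase found
    and the number of positions that achieve it. Stops at the end of
    either sequence or when the running score falls X below its maximum.
    """
    score = best = 0
    length = best_length = 0
    while 0 <= q < len(query) and 0 <= s < len(subject):
        score += M if query[q] == subject[s] else N
        length += 1
        if score > best:
            best, best_length = score, length
        elif best - score >= X:
            break
        q += step
        s += step
    return best, best_length


def hsp_from_seed(query, subject, q, s, W=11, M=5, N=-4, X=20):
    """Extend a seed in both directions; return (score, q_start, q_end)."""
    right_gain, right_len = extend(query, subject, q + W, s + W, +1, M, N, X)
    left_gain, left_len = extend(query, subject, q - 1, s - 1, -1, M, N, X)
    return W * M + left_gain + right_gain, q - left_len, q + W + right_len


if __name__ == "__main__":
    query, subject = "CCACGTACGTGG", "TTTTACGTACGTAAAA"  # hypothetical
    for q, s in find_seeds(query, subject, W=8):
        print((q, s), hsp_from_seed(query, subject, q, s, W=8))
```

A smaller word length W produces more seeds and therefore greater sensitivity at the cost of speed, while a larger X allows extension through longer runs of mismatches before an HSP is terminated.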

In addition to calculating percent sequence identity, the BLAST algorithm also performs
a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul (1993)
Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST
algorithm is the smallest sum probability (P(N)), which provides an indication of the probability
by which a match between two nucleotide or amino acid sequences would occur by chance. For
example, a nucleic acid is considered similar to a reference sequence (and, therefore, in this
context, homologous) if the smallest sum probability in a comparison of the test nucleic acid to
the reference nucleic acid is less than about 0.1, or less than about 0.01, or even less than
about 0.001. An additional example of a useful sequence alignment algorithm is PILEUP.
PILEUP creates a multiple sequence alignment from a group of related sequences using
progressive, pairwise alignments. The program can align, e.g., up to 300 sequences of a
maximum length of 5,000 letters.

The integrated system or computer typically includes a user input interface allowing a
user to selectively view one or more sequence records corresponding to the one or more character
strings, as well as an instruction set which aligns the one or more character strings with each other
or with an additional character string to identify one or more regions of sequence similarity. The
system may include a link of one or more character strings with a particular phenotype or gene
function. Typically, the system includes a user readable output element which displays an
alignment produced by the alignment instruction set.

The methods of this invention can be implemented in a localized or distributed
computing environment. In a distributed environment, the methods may be implemented on a single
computer comprising multiple processors or on a multiplicity of computers. The computers can
be linked, e.g., through a common bus, but more preferably the computer(s) are nodes on a
network. The network can be a generalized or a dedicated local or wide-area network and, in
certain preferred embodiments, the computers may be components of an intranet or an internet.

Thus, the invention provides methods for identifying a sequence similar or homologous
to one or more polynucleotides as noted herein, or one or more target polypeptides encoded by
the polynucleotides, or otherwise noted herein and may include linking or associating a given
plant phenotype or gene function with a sequence. In the methods, a sequence database is
provided (locally or across an internet or intranet) and a query is made against the sequence
database using the relevant sequences herein and associated plant phenotypes or gene functions.

Any sequence herein can be entered into the database, before or after querying the
database. This provides for both expansion of the database and, if done before the querying step,
for insertion of control sequences into the database. The control sequences can be detected by the
query to ensure the general integrity of both the database and the query. As noted, the query can
be performed using a web browser based interface. For example, the database can be a
centralized public database such as those noted herein, and the querying can be done from a
remote terminal or computer across an internet or intranet.

EXAMPLES 

The following examples are intended to illustrate but not limit the present invention.

EXAMPLE I. FULL LENGTH GENE IDENTIFICATION AND CLONING 

Putative transcription factor sequences (genomic or ESTs) related to known transcription
factors were identified in the Arabidopsis thaliana GenBank database using the tblastn sequence
analysis program with default parameters and a P-value cutoff threshold of -4 or -5 or lower,
depending on the length of the query sequence. Putative transcription factor sequence hits were
then screened to identify those containing particular sequence strings. If the sequence hits
contained such sequence strings, the sequences were confirmed as transcription factors.

Alternatively, Arabidopsis thaliana cDNA libraries derived from different tissues or
treatments, or genomic libraries, were screened to identify novel members of a transcription
factor family using a low stringency hybridization approach. Probes were synthesized using gene
specific primers in a standard PCR reaction (annealing temperature 60 °C) and labeled with 32P
dCTP using the High Prime DNA Labeling Kit (Boehringer Mannheim). Purified radiolabeled
probes were added to filters immersed in Church hybridization medium (0.5 M NaPO4 pH 7.0,
7% SDS, 1% w/v bovine serum albumin) and hybridized overnight at 60 °C with shaking. Filters
were washed two times for 45 to 60 minutes with 1x SSC, 1% SDS at 60 °C.

To identify additional sequence 5' or 3' of a partial cDNA sequence in a cDNA library, 5'
and 3' rapid amplification of cDNA ends (RACE) was performed using the Marathon™ cDNA
amplification kit (Clontech, Palo Alto, CA). Generally, the method entailed first isolating
poly(A) mRNA, performing first and second strand cDNA synthesis to generate double stranded
cDNA, blunting cDNA ends, followed by ligation of the Marathon™ Adaptor to the cDNA to
form a library of adaptor-ligated ds cDNA.

Gene-specific primers were designed to be used along with adaptor specific primers for
both 5' and 3' RACE reactions. Nested primers, rather than single primers, were used to increase
PCR specificity. Using 5' and 3' RACE reactions, 5' and 3' RACE fragments were obtained,
sequenced and cloned. The process can be repeated until the 5' and 3' ends of the full-length gene
are identified. The full-length cDNA was then generated by end-to-end PCR using primers
specific to the 5' and 3' ends of the gene.

EXAMPLE II. CONSTRUCTION OF EXPRESSION VECTORS 

The sequence was amplified from a genomic or cDNA library using primers specific to 
sequences upstream and downstream of the coding region. The expression vector was pMEN20 
or pMEN65, which are both derived from pMON316 (Sanders et al. (1987) Nucleic Acids 
Research 15:1543-58) and contain the CaMV 35S promoter to express transgenes. To clone the 
sequence into the vector, both pMEN20 and the amplified DNA fragment were digested 
separately with SalI and NotI restriction enzymes at 37°C for 2 hours. The digestion products 
were subjected to electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide 
staining. The DNA fragments containing the sequence and the linearized plasmid were excised 
and purified using a Qiaquick gel extraction kit (Qiagen, CA). The fragments of interest were 
ligated at a ratio of 3:1 (vector to insert). Ligation reactions using T4 DNA ligase (New England 
Biolabs, MA) were carried out at 16°C for 16 hours. The ligated DNAs were transformed into 
competent cells of the E. coli strain DH5alpha using the heat shock method. The 
transformations were plated on LB plates containing 50 mg/l kanamycin (Sigma). 
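
As a worked illustration of the 3:1 (vector to insert) ligation ratio, the sketch below computes the mass of insert needed for a given mass of vector. It assumes the ratio is a molar ratio and that moles of double-stranded DNA are proportional to mass divided by length; the fragment sizes and vector mass are hypothetical, since the text does not give them.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, vector_to_insert=3.0):
    """Mass of insert (ng) that gives the requested vector:insert molar ratio."""
    # For double-stranded DNA, molar amount is proportional to mass / length.
    vector_units = vector_ng / vector_bp
    insert_units = vector_units / vector_to_insert
    return insert_units * insert_bp


# Hypothetical sizes: a 12,000 bp linearized vector and a 2,000 bp insert.
mass = insert_mass_ng(vector_ng=90.0, vector_bp=12000, insert_bp=2000)
print(round(mass, 3))
```

With these assumed sizes, 90 ng of vector pairs with about 5 ng of insert.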

Individual colonies were grown overnight in five milliliters of LB broth containing 50 
mg/l kanamycin at 37°C. Plasmid DNA was purified using Qiaquick Mini Prep kits (Qiagen, 
CA). 

EXAMPLE III. TRANSFORMATION OF AGROBACTERIUM WITH THE 
EXPRESSION VECTOR 

After the plasmid vector containing the gene was constructed, the vector was used to 
transform Agrobacterium tumefaciens cells expressing the gene products. The stock of 
Agrobacterium tumefaciens cells for transformation was made as described by Nagel et al. 
(1990) FEMS Microbiol. Letts. 67: 325-328. Agrobacterium strain ABI was grown in 250 ml LB 
medium (Sigma) overnight at 28°C with shaking until an absorbance (A600) of 0.5 - 1.0 was 
reached. Cells were harvested by centrifugation at 4,000 x g for 15 min at 4°C. Cells were then 
resuspended in 250 µl chilled buffer (1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were 
centrifuged again as described above and resuspended in 125 µl chilled buffer. Cells were then 
centrifuged and resuspended two more times in the same HEPES buffer as described above at a 
volume of 100 µl and 750 µl, respectively. Resuspended cells were then distributed into 40 µl 
aliquots, quickly frozen in liquid nitrogen, and stored at -80°C. 

Agrobacterium cells were transformed with plasmids prepared as described above 
following the protocol described by Nagel et al. For each DNA construct to be transformed, 50 - 
100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 8.0) was mixed with 
40 µl of Agrobacterium cells. The DNA/cell mixture was then transferred to a chilled cuvette 
with a 2 mm electrode gap and subjected to a 2.5 kV charge dissipated at 25 µF and 200 µF using a 
Gene Pulser II apparatus (Bio-Rad). After electroporation, cells were immediately resuspended 
in 1.0 ml LB and allowed to recover without antibiotic selection for 2 - 4 hours at 28°C in a 
shaking incubator. After recovery, cells were plated onto selective medium of LB broth 
containing 100 µg/ml spectinomycin (Sigma) and incubated for 24-48 hours at 28°C. Single 
colonies were then picked and inoculated in fresh medium. The presence of the plasmid construct 
was verified by PCR amplification and sequence analysis. 

EXAMPLE IV. TRANSFORMATION OF ARABIDOPSIS PLANTS WITH 
AGROBACTERIUM TUMEFACIENS WITH EXPRESSION VECTOR 

After transformation of Agrobacterium tumefaciens with plasmid vectors containing the 
gene, single Agrobacterium colonies were identified, propagated, and used to transform 
Arabidopsis plants. Briefly, 500 ml cultures of LB medium containing 50 mg/l kanamycin were 
inoculated with the colonies and grown at 28°C with shaking for 2 days until an absorbance 
(A600) of > 2.0 was reached. Cells were then harvested by centrifugation at 4,000 x g for 10 min, 
and resuspended in infiltration medium (1/2 X Murashige and Skoog salts (Sigma), 1 X 
Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044 µM benzylamino purine 
(Sigma), 200 µl/L Silwet L-77 (Lehle Seeds)) until an absorbance (A600) of 0.8 was reached. 
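
The resuspension step above can be planned with a simple dilution estimate. The sketch below assumes that absorbance scales linearly with cell density (only approximately true at high absorbance) and that all cells are recovered after centrifugation; the function name is illustrative and the result is a starting estimate to be checked against a measured A600.

```python
def resuspension_volume_ml(culture_ml, culture_a600, target_a600=0.8):
    """Estimate the infiltration-medium volume that brings the cells to target_a600."""
    if target_a600 <= 0:
        raise ValueError("target absorbance must be positive")
    # Total cell amount is taken as absorbance x volume (linear-scaling assumption).
    return culture_ml * culture_a600 / target_a600


# A 500 ml culture harvested at A600 = 2.0, resuspended to A600 = 0.8.
volume = resuspension_volume_ml(500.0, 2.0)
print(round(volume, 1))
```

Under these assumptions a 500 ml culture at A600 2.0 yields roughly 1250 ml of suspension at A600 0.8.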

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were sown at a 
density of ~10 plants per 4" pot onto Pro-Mix BX potting medium (Hummert International) 
covered with fiberglass mesh (18 mm X 16 mm). Plants were grown under continuous 
illumination (50-75 µE/m2/sec) at 22-23°C with 65-70% relative humidity. After about 4 weeks, 
primary inflorescence stems (bolts) were cut off to encourage growth of multiple secondary bolts. 
After flowering of the mature secondary bolts, plants were prepared for transformation by 
removal of all siliques and opened flowers. 

The pots were then immersed upside down in the mixture of Agrobacterium infiltration 
medium as described above for 30 sec, and placed on their sides to allow draining into a 1' x 2' 
flat surface covered with plastic wrap. After 24 h, the plastic wrap was removed and pots were 
turned upright. The immersion procedure was repeated one week later, for a total of two 
immersions per pot. Seeds were then collected from each transformation pot and analyzed 
following the protocol described below. 

EXAMPLE V. IDENTIFICATION OF ARABIDOPSIS PRIMARY 
TRANSFORMANTS 

Seeds collected from the transformation pots were sterilized essentially as follows. Seeds 
were dispersed into a solution containing 0.1% (v/v) Triton X-100 (Sigma) and sterile H2O and 
washed by shaking the suspension for 20 min. The wash solution was then drained and replaced 
with fresh wash solution to wash the seeds for 20 min with shaking. After removal of the second 
wash solution, a solution containing 0.1% (v/v) Triton X-100 and 70% ethanol (Equistar) was 
added to the seeds and the suspension was shaken for 5 min. After removal of the 
ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% (v/v) bleach 
(Clorox) was added to the seeds, and the suspension was shaken for 10 min. After removal of the 
bleach/detergent solution, seeds were then washed five times in sterile distilled H2O. The seeds 
were stored in the last wash water at 4°C for 2 days in the dark before being plated onto antibiotic 
selection medium (1 X Murashige and Skoog salts (pH adjusted to 5.7 with 1M KOH), 1 X 
Gamborg's B-5 vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l kanamycin). Seeds 
were germinated under continuous illumination (50-75 µE/m2/sec) at 22-23°C. After 7-10 days 
of growth under these conditions, kanamycin resistant primary transformants (T1 generation) 
were visible and obtained. These seedlings were transferred first to fresh selection plates where 
the seedlings continued to grow for 3-5 more days, and then to soil (Pro-Mix BX potting 
medium). 

Primary transformants were crossed and progeny seeds (T2) collected; kanamycin 
resistant seedlings were selected and analyzed. The expression levels of the recombinant 
polynucleotides in the transformants vary from about a 5% expression level increase to at least a 
100% expression level increase. Similar observations are made with respect to polypeptide level 
expression. 

EXAMPLE VI. IDENTIFICATION OF ARABIDOPSIS PLANTS WITH 
TRANSCRIPTION FACTOR GENE KNOCKOUTS 

The screening of insertion mutagenized Arabidopsis collections for null mutants in a 
known target gene was essentially as described in Krysan et al. (1999) Plant Cell 11:2283-2290. 
Briefly, gene-specific primers, nested by 5-250 base pairs to each other, were designed from the 
5' and 3' regions of a known target gene. Similarly, nested sets of primers were also created 
specific to each of the T-DNA or transposon ends (the "right" and "left" borders). All possible 
combinations of gene specific and T-DNA/transposon primers were used to detect by PCR an 
insertion event within or close to the target gene. The amplified DNA fragments were then 
sequenced, which allowed the precise determination of the T-DNA/transposon insertion point 
relative to the target gene. Insertion events within the coding or intervening sequence of the 
genes were deconvoluted from a pool comprising a plurality of insertion events to a single unique 
mutant plant for functional characterization. The method is described in more detail in Yu and 
Adam, US Application Serial No. 09/177,733 filed October 23, 1998. 
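
The "all possible combinations" of gene-specific and insertion-specific primers can be enumerated mechanically. The sketch below is illustrative only: the primer names are hypothetical labels, and the number of primers per set is an assumption rather than a detail taken from the text.

```python
from itertools import product


def primer_pairs(gene_primers, insertion_primers):
    """Return every pairing of one gene-specific primer with one T-DNA/transposon primer."""
    return list(product(gene_primers, insertion_primers))


# Hypothetical nested primer sets for one target gene and the two insertion borders.
gene_primers = ["gene_5p_outer", "gene_5p_nested", "gene_3p_outer", "gene_3p_nested"]
insertion_primers = ["LB_outer", "LB_nested", "RB_outer", "RB_nested"]

pairs = primer_pairs(gene_primers, insertion_primers)
print(len(pairs))
```

Each pair corresponds to one PCR reaction on the pooled template, so four gene-specific and four border primers give sixteen reactions per pool.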

EXAMPLE VII. IDENTIFICATION OF MODIFIED BIOCHEMICAL 
CHARACTERISTICS PHENOTYPE IN OVEREXPRESSOR OR GENE KNOCKOUT 
PLANTS 

Experiments were performed to identify those transformants or knockouts that exhibited 
modified biochemical characteristics. Among the biochemicals that were assayed were insoluble 
sugars, such as arabinose, fucose, galactose, mannose, rhamnose or xylose or the like; prenyl 
lipids, such as lutein, beta-carotene, xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, 
delta- or gamma-tocopherol or the like; fatty acids, such as 16:0 (palmitic acid), 16:1 (palmitoleic 
acid), 18:0 (stearic acid), 18:1 (oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic acid), 20:1 
(eicosenoic acid), 20:2, 22:1 (erucic acid) or the like; waxes, such as by altering the levels of C29, 
C31, or C33 alkanes; sterols, such as brassicasterol, campesterol, stigmasterol, sitosterol or 
stigmastanol or the like; glucosinolates; protein or oil levels. 

Fatty acids were measured using two methods depending on whether the tissue was from 
leaves or seeds. For leaves, lipids were extracted and esterified with hot methanolic H2SO4 and 
partitioned into hexane from methanolic brine. For seed fatty acids, seeds were pulverized and 
extracted in methanol:heptane:toluene:2,2-dimethoxypropane:H2SO4 (39:34:20:5:2) for 90 
minutes at 80°C. After cooling to room temperature the upper phase, containing the seed fatty 
acid esters, was subjected to GC analysis. Fatty acid esters from both seed and leaf tissues were 
analyzed with a Supelco SP-2330 column. 

Glucosinolates were purified from seeds or leaves by first heating the tissue at 95°C for 
10 minutes. Preheated ethanol:water (50:50) was added and, after heating at 95°C for a further 10 minutes, 
the extraction solvent was applied to a DEAE Sephadex column which had been previously 
equilibrated with 0.5 M pyridine acetate. Desulfoglucosinolates were eluted with 300 µl water 
and analyzed by reverse phase HPLC monitoring at 226 nm. 

For wax alkanes, samples were extracted using a method identical to that used for fatty acids, and 
extracts were analyzed on an HP 5890 GC coupled with a 5973 MSD. Samples were 
chromatographed on a J&W DB35 column (J&W Scientific). 

To measure prenyl lipid levels, seeds or leaves were pulverized with 1 to 2% pyrogallol 
as an antioxidant. For seeds, extracted samples were filtered and a portion removed for 
tocopherol and carotenoid/chlorophyll analysis by HPLC. The remaining material was saponified 
for sterol determination. For leaves, an aliquot was removed and diluted with methanol, and 
chlorophyll A, chlorophyll B, and total carotenoids were measured by spectrophotometry by 
determining absorbance at 665.2 nm, 652.5 nm, and 470 nm. An aliquot was removed for 
tocopherol and carotenoid/chlorophyll composition by HPLC using a Waters µBondapak C18 
column (4.6 mm x 150 mm). The remaining methanolic solution was saponified with 10% KOH 
at 80°C for one hour. The samples were cooled and diluted with a mixture of methanol and 
water. A solution of 2% methylene chloride in hexane was mixed in and the samples were 
centrifuged. The aqueous methanol phase was re-extracted with 2% methylene chloride in 
hexane and, after centrifugation, the two upper phases were combined and evaporated. 2% 
methylene chloride in hexane was added to the tubes and the samples were then extracted with 
one ml of water. The upper phase was removed, dried, resuspended in 400 µl of 2% 
methylene chloride in hexane, and analyzed by gas chromatography using a 50 m DB-5ms column (0.25 
mm ID, 0.25 µm phase, J&W Scientific). 
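
The text does not state which equations were used to convert the three absorbance readings into pigment concentrations. As an illustration only, the sketch below applies coefficients commonly attributed to Lichtenthaler (1987) for methanol extracts; both the choice of equations and the sample absorbance values are assumptions, not details from this document.

```python
def pigments_ug_per_ml(a665_2, a652_5, a470):
    """Estimate chlorophyll a, chlorophyll b and total carotenoids (µg/ml of extract).

    Coefficients are assumed (methanol-extract equations attributed to
    Lichtenthaler, 1987); the document itself does not specify them.
    """
    chl_a = 16.72 * a665_2 - 9.16 * a652_5
    chl_b = 34.09 * a652_5 - 15.28 * a665_2
    carotenoids = (1000.0 * a470 - 1.63 * chl_a - 104.96 * chl_b) / 221.0
    return chl_a, chl_b, carotenoids


# Hypothetical absorbance readings for one diluted leaf extract.
chl_a, chl_b, carotenoids = pigments_ug_per_ml(0.5, 0.3, 0.4)
print(round(chl_a, 3), round(chl_b, 3), round(carotenoids, 3))
```

Values returned are concentrations in the measured solution and would still need to be scaled by the dilution factor and tissue mass.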

Insoluble sugar levels were measured by the method essentially described by Reiter et al., 
Plant Journal 12:335-345. This method analyzes the neutral sugar composition of cell wall 
polymers found in Arabidopsis leaves. Soluble sugars were separated from sugar polymers by 
extracting leaves with hot 70% ethanol. The remaining residue containing the insoluble 
polysaccharides was then acid hydrolyzed with allose added as an internal standard. Sugar 
monomers generated by the hydrolysis were then reduced to the corresponding alditols by 
treatment with NaBH4, then acetylated to generate the volatile alditol acetates, which were 
then analyzed by GC-FID. Identity of the peaks was determined by comparing the retention times 
of known sugars converted to the corresponding alditol acetates with the retention times of peaks 
from wild-type plant extracts. Alditol acetates were analyzed on a Supelco SP-2330 capillary 
column (30 m x 250 µm x 0.2 µm) using a temperature program beginning at 180°C for 2 
minutes followed by an increase to 220°C in 4 minutes. After holding at 220°C for 10 minutes, 
the oven temperature was increased to 240°C in 2 minutes, held at this temperature for 10 
minutes, and brought back to room temperature. 
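
The oven temperature program described above can be written out as a list of segments to check the implied ramp rates and run time. This is a small bookkeeping sketch; the segment representation is an illustrative choice and the final cool-down to room temperature is left out because its duration is not given.

```python
# Each segment: (kind, start °C, end °C, duration in minutes), taken from the text above.
segments = [
    ("hold", 180.0, 180.0, 2.0),
    ("ramp", 180.0, 220.0, 4.0),
    ("hold", 220.0, 220.0, 10.0),
    ("ramp", 220.0, 240.0, 2.0),
    ("hold", 240.0, 240.0, 10.0),
]


def ramp_rates(program):
    """Heating rate (°C per minute) of every ramp segment."""
    return [(end - start) / minutes
            for kind, start, end, minutes in program if kind == "ramp"]


def total_minutes(program):
    """Total programmed time, excluding the final cool-down."""
    return sum(minutes for _, _, _, minutes in program)


print(ramp_rates(segments), total_minutes(segments))
```

Both ramps work out to 10°C per minute, and the programmed portion of the run lasts 28 minutes.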

To identify plants with alterations in total seed oil or protein content, 150 mg of seeds 
from T2 progeny plants were subjected to analysis by Near Infrared Reflectance (NIR) using a 
Foss NirSystems Model 6500 with a spinning cup transport system. 

Table 3 shows the phenotypes observed for particular overexpressor or knockout plants 
and provides the SEQ ID No., the internal reference code (GID), whether a knockout or 
overexpressor plant was analyzed and the observed phenotype. 



Table 3 

SEQ ID No. | GID | Knockout (KO) or overexpressor (OE) | Phenotype observed 
1 | G214 | OE | Increase in leaf fatty acids, for example 100% increase in 18:0 fatty acid. Also up to 100% increase in leaf chlorophyll and 100% increase in leaf carotenoids 
3 | G231 | OE | Up to 5% increase in leaf 18:3 fatty acid 
5 | G274 | OE | Up to 50% increase in leaf arabinose 
7 | G307 | OE | Altered in leaf insoluble sugars, for example up to 44% decrease in mannose. 
9 | G346 | OE | Altered leaf fatty acids, for example 25% increase in 16:3 and altered insoluble sugars, for example up to 25% increase in fucose 
11 | G598 | OE | Altered in insoluble sugars, for example up to 20% decrease in rhamnose and up to 10% increase in galactose 
13 | G605 | OE | Altered in leaf fatty acids, for example up to 20% increase in 16:1 fatty acid. 
15 | G777 | OE | Altered in insoluble sugars, for example up to 60% increase in leaf rhamnose 
17 | G869 | OE | Alteration in leaf fatty acids, e.g. up to 39% decrease in 16:0 fatty acid; up to 43% increase in fucose 
19 | G1133 | OE | Up to 34% decrease in leaf lutein 
21 | G1266 | OE | Alteration in leaf fatty acids, for example up to 50% increase in 18:0 fatty acid. Alterations in leaf insoluble sugars, for example a 45% decrease in rhamnose 
23 | G1324 | OE | Up to 65% decrease in leaf lutein and up to 84% increase in leaf xanthophyll 
25 | G1337 | OE | Alteration in leaf fatty acids, for example up to 28% increase in 18:1 fatty acid 
27 | G975 | OE | Up to 13-fold increase in wax in leaves 



For a particular overexpressor that shows a less beneficial biochemical characteristic, it 
may be more useful to select a plant with a decreased expression of the particular transcription 
factor. For a particular knockout that shows a less beneficial biochemical characteristic, it may be 
more useful to select a plant with an increased expression of the particular transcription factor. 
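
Table 3 reports phenotypes as percentage or fold changes. The sketch below shows one way such figures could be computed, assuming they are relative changes against a wild-type control measured in the same assay; the text does not state the reference explicitly, and the measurement values are hypothetical.

```python
def percent_change(wild_type, transgenic):
    """Relative change of the transgenic measurement versus wild type, in percent."""
    if wild_type == 0:
        raise ValueError("wild-type measurement must be non-zero")
    return 100.0 * (transgenic - wild_type) / wild_type


def fold_change(wild_type, transgenic):
    """Ratio of the transgenic measurement to the wild-type measurement."""
    if wild_type == 0:
        raise ValueError("wild-type measurement must be non-zero")
    return transgenic / wild_type


# Hypothetical measurements in arbitrary units.
doubling = percent_change(2.0, 4.0)    # a doubling is a 100% increase
decrease = percent_change(10.0, 5.6)   # a drop from 10.0 to 5.6 is a 44% decrease
wax_fold = fold_change(1.5, 19.5)      # 13-fold increase
print(round(doubling, 1), round(decrease, 1), round(wax_fold, 1))
```

Note that a 13-fold increase corresponds to a 1200% increase under this definition, which is why large changes are usually quoted as fold changes.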



EXAMPLE VIII. IDENTIFICATION OF HOMOLOGOUS SEQUENCES 

Homologous sequences from Arabidopsis and plant species other than Arabidopsis were 
identified using database sequence search tools, such as the Basic Local Alignment Search Tool 
(BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucl. Acid 
Res. 25: 3389-3402). The tblastx sequence analysis programs were employed using the 
BLOSUM-62 scoring matrix (Henikoff, S. and Henikoff, J. G. (1992) Proc. Natl. Acad. Sci. USA 
89: 10915-10919). 

Identified Arabidopsis homologous sequences are provided in Figure 2 and included in 
the Sequence Listing. The percent sequence identity among these sequences is as low as 47%. 
Additionally, the entire NCBI GenBank database was filtered for sequences 
from all plants except Arabidopsis thaliana by selecting all entries in the NCBI GenBank 
database associated with NCBI taxonomic ID 33090 (Viridiplantae; all plants) and excluding 
entries associated with taxonomic ID 3701 (Arabidopsis thaliana). These sequences were 
compared to sequences representing genes of SEQ ID Nos. 1-54 on 9/26/2000 using the 
Washington University TBLASTX algorithm (version 2.0a19MP). For each gene of SEQ ID 
Nos. 1-54, individual comparisons were ordered by probability score (P-value), where the score 
reflects the probability that a particular alignment occurred by chance. For example, a score of 
3.6e-40 is 3.6 x 10^-40. For up to ten species, the gene with the lowest P-value (and therefore the 
most likely homolog) is listed in Figure 3. 
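
The selection rule described above (order comparisons by P-value, then report the best gene for up to ten species) can be sketched as follows. This is one reading of the rule, assuming one best hit is kept per species; the hit identifiers are hypothetical and the tuple layout is an illustrative choice.

```python
def best_hits_per_species(hits, max_species=10):
    """Keep the lowest-P-value hit for each species, then the top max_species species."""
    best = {}
    for gene_id, p_value, species in hits:
        if species not in best or p_value < best[species][1]:
            best[species] = (gene_id, p_value, species)
    # Lowest P-value first, i.e. the least likely to have arisen by chance.
    return sorted(best.values(), key=lambda hit: hit[1])[:max_species]


# Hypothetical comparison results; "3.6e-40" parses directly as 3.6 x 10^-40.
hits = [
    ("hyp_1", float("3.6e-40"), "Glycine max"),
    ("hyp_2", float("1.1e-57"), "Lotus japonicus"),
    ("hyp_3", float("2.0e-12"), "Glycine max"),
]
top = best_hits_per_species(hits)
print(top)
```

Here the weaker of the two Glycine max hits is dropped, and the remaining hits are listed with the Lotus japonicus hit first.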

In addition to P-values, comparisons were also scored by percentage identity. Percentage 
identity reflects the degree to which two segments of DNA or protein are identical over a 
particular length. The ranges of percent identity between the non-Arabidopsis genes shown in 
Figure 3 and the Arabidopsis genes in the sequence listing are: SEQ ID No. 1: 38%-89%; SEQ ID 
No. 3: 64%-88%; SEQ ID No. 5: 44%-84%; SEQ ID No. 7: 35%-86%; SEQ ID No. 9: 43%-77%; 
SEQ ID No. 11: 43%-85%; SEQ ID No. 13: 41%-76%; SEQ ID No. 15: 34%-63%; SEQ ID No. 
17: 31%-68%; SEQ ID No. 19: 26%^4%; SEQ ID No. 21: 52%-70%; SEQ ID No. 23: 37%-93%; 
SEQ ID No. 25: 37%-58%; SEQ ID No. 27: 48%-92%; SEQ ID No. 29: 42%-88%; SEQ ID 
No. 31: 47%-90%; SEQ ID No. 33: 45%-69%; SEQ ID No. 35: 42%-94%; SEQ ID No. 37: 38%-85%; 
SEQ ID No. 39: 49%-93%; SEQ ID No. 41: 36%-64%; and SEQ ID No. 43: 36%-70%. 
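
Percentage identity over an aligned segment can be computed as sketched below. Definitions differ between tools (for example in whether gapped columns count toward the length); this sketch assumes gapped columns are ignored and is not necessarily the calculation performed by the TBLASTX program. The example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of ungapped aligned columns in which the two sequences are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned protein segments differing at two of ten positions.
identity = percent_identity("MKRWTEEEHN", "MKRWSEEDHN")
print(identity)
```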

The polynucleotides and polypeptides in the Sequence Listing and the identified 
homologous sequences may be stored in a computer system and have associated or linked with 
the sequences a function, such as that the polynucleotides and polypeptides are useful for 
modifying the biochemical characteristics of a plant. 
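
One simple way to store sequences together with a linked function is a keyed record, sketched below. The record layout and field names are illustrative assumptions rather than a description of any particular system; the two example entries reuse values from Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    seq_id_no: int         # SEQ ID No. in the Sequence Listing
    gid: str               # internal reference code
    line_type: str         # "OE" (overexpressor) or "KO" (knockout)
    linked_function: str   # function or phenotype associated with the sequence


def build_index(records):
    """Index records by SEQ ID No. so a sequence can be looked up with its function."""
    return {record.seq_id_no: record for record in records}


records = [
    SequenceRecord(1, "G214", "OE", "increase in leaf fatty acids"),
    SequenceRecord(19, "G1133", "OE", "decrease in leaf lutein"),
]
index = build_index(records)
print(index[19].gid, "->", index[19].linked_function)
```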

All references, publications, patents and other documents herein are incorporated by 
reference in their entirety for all purposes. Although the invention has been described with 
reference to the embodiments and examples above, it should be understood that various 
modifications can be made without departing from the spirit of the invention. 

What is claimed is: 

1. A transgenic plant with a modified biochemical characteristic, which plant comprises a 
recombinant polynucleotide comprising a nucleotide sequence selected from the group consisting 
of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected from 
SEQ ID Nos. 2N, where N=1-22, or a complementary nucleotide sequence thereof; 

(b) a nucleotide sequence encoding a polypeptide comprising a conservatively substituted 
variant of a polypeptide of (a); 

(c) a nucleotide sequence comprising a sequence selected from those of SEQ ID Nos. 2N-1, 
where N=1-22, or a complementary nucleotide sequence thereof; 

(d) a nucleotide sequence comprising silent substitutions in a nucleotide sequence of (c); 

(e) a nucleotide sequence which hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a), (b), (c), or (d); 

(f) a nucleotide sequence comprising at least 15 consecutive nucleotides of a sequence of 
any of (a)-(e); 

(g) a nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which 
subsequence or fragment encodes a polypeptide that modifies a plant's biochemical 
characteristic; 

(h) a nucleotide sequence having at least 31% sequence identity to a nucleotide sequence 
of any of (a)-(g); 

(i) a nucleotide sequence having at least 60% sequence identity to a nucleotide 
sequence of any of (a)-(g); 

(j) a nucleotide sequence which encodes a polypeptide having at least 31% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-22; 

(k) a nucleotide sequence which encodes a polypeptide having at least 60% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-22; and 

(l) a nucleotide sequence which encodes a polypeptide having at least 65% sequence 
identity to a conserved domain of a polypeptide of SEQ ID Nos. 2N, where N=1-22. 

2. The transgenic plant of claim 1, further comprising a constitutive, inducible, or tissue-active 
promoter operably linked to said nucleotide sequence. 

3. The transgenic plant of claim 1, wherein the plant is selected from the group consisting 
of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, alfalfa, sugarcane, turf, 
banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, 
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, 
pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, and 
vegetable brassicas. 

4. An isolated or recombinant polynucleotide comprising a nucleotide sequence selected 
from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected from 
SEQ ID Nos. 2N, where N=1-22, or a complementary nucleotide sequence thereof; 

(b) a nucleotide sequence encoding a polypeptide comprising a conservatively substituted 
variant of a polypeptide of (a); 

(c) a nucleotide sequence comprising a sequence selected from those of SEQ ID Nos. 2N-1, 
where N=1-22, or a complementary nucleotide sequence thereof; 

(d) a nucleotide sequence comprising silent substitutions in a nucleotide sequence of (c); 

(e) a nucleotide sequence which hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a), (b), (c), or (d); 

(f) a nucleotide sequence comprising at least 15 consecutive nucleotides of a sequence of 
any of (a)-(e); 

(g) a nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which 
subsequence or fragment encodes a polypeptide that modifies a plant's biochemical 
characteristic; 

(h) a nucleotide sequence having at least 31% sequence identity to a nucleotide sequence 
of any of (a)-(g); 

(i) a nucleotide sequence having at least 60% sequence identity to a nucleotide 
sequence of any of (a)-(g); 

(j) a nucleotide sequence which encodes a polypeptide having at least 31% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-22; 

(k) a nucleotide sequence which encodes a polypeptide having at least 60% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-22; and 

(l) a nucleotide sequence which encodes a conserved domain of a polypeptide having at 
least 65% sequence identity to a conserved domain of a polypeptide of SEQ ID Nos. 2N, 
where N=1-22. 

5. The isolated or recombinant polynucleotide of claim 4, further comprising a constitutive, 
inducible, or tissue-active promoter operably linked to the nucleotide sequence. 

6. A cloning or expression vector comprising the isolated or recombinant polynucleotide of 
claim 4. 

7. A cell comprising the cloning or expression vector of claim 6. 

8. A transgenic plant comprising the isolated or recombinant polynucleotide of claim 4. 

9. A composition produced by one or more of: 

(a) incubating one or more polynucleotide of claim 4 with a nuclease; 

(b) incubating one or more polynucleotide of claim 4 with a restriction enzyme; 

(c) incubating one or more polynucleotide of claim 4 with a polymerase; 

(d) incubating one or more polynucleotide of claim 4 with a polymerase and a primer; 

(e) incubating one or more polynucleotide of claim 4 with a cloning vector; or 

(f) incubating one or more polynucleotide of claim 4 with a cell. 






10. A composition comprising two or more different polynucleotides of claim 4. 

11. An isolated or recombinant polypeptide comprising a subsequence of at least about 15 
contiguous amino acids encoded by the recombinant or isolated polynucleotide of claim 4. 

12. A plant ectopically expressing an isolated polypeptide of claim 11. 

13. A method for producing a plant having a modified biochemical characteristic, the method 
comprising altering the expression of the isolated or recombinant polynucleotide of claim 4 or the 
expression levels or activity of a polypeptide of claim 11 in a plant, thereby producing a modified 
plant, and selecting the modified plant for a modified biochemical characteristic thereby 
providing the modified plant with a modified biochemical characteristic. 

14. The method of claim 13, wherein the polynucleotide is a polynucleotide of claim 4. 

15. A method of identifying a factor that is modulated by or interacts with a polypeptide 
encoded by a polynucleotide of claim 4, the method comprising: 

(a) expressing a polypeptide encoded by the polynucleotide in a plant; and 

(b) identifying at least one factor that is modulated by or interacts with the polypeptide. 

16. The method of claim 15, wherein the identifying is performed by detecting binding by the 
polypeptide to a promoter sequence, or detecting interactions between an additional protein and 
the polypeptide in a yeast two hybrid system. 

17. The method of claim 15, wherein the identifying is performed by detecting expression of 
a factor by hybridization to a microarray, subtractive hybridization or differential display. 

18. A method of identifying a molecule that modulates activity or expression of a 
polynucleotide or polypeptide of interest, the method comprising: 

(a) placing the molecule in contact with a plant comprising the polynucleotide or 
polypeptide encoded by the polynucleotide of claim 4; and, 

(b) monitoring one or more of: 

(i) expression level of the polynucleotide in the plant; 

(ii) expression level of the polypeptide in the plant; 

(iii) modulation of an activity of the polypeptide in the plant; or 

(iv) modulation of an activity of the polynucleotide in the plant. 

19. An integrated system, computer or computer readable medium comprising one or more 
character strings corresponding to a polynucleotide of claim 4, or to a polypeptide encoded by the 
polynucleotide. 

20. The integrated system, computer or computer readable medium of claim 19, further 
comprising a link between said one or more sequence strings to a modified plant biochemical 
characteristics phenotype. 

21. A method of identifying a sequence similar or homologous to one or more 
polynucleotides of claim 4, or one or more polypeptides encoded by the polynucleotides, the 
method comprising: 

(a) providing a sequence database; and, 

(b) querying the sequence database with one or more target sequences corresponding to 
the one or more polynucleotides or to the one or more polypeptides to identify one or 
more sequence members of the database that display sequence similarity or homology to 
one or more of the one or more target sequences. 

22. The method of claim 21, wherein the querying comprises aligning one or more of the 
target sequences with one or more of the one or more sequence members in the sequence 
database. 

23. The method of claim 21, wherein the querying comprises identifying one or more of the 
one or more sequence members of the database that meet a user-selected identity criteria with one 
or more of the target sequences. 

24. The method of claim 21, further comprising linking the one or more of the 
polynucleotides of claim 4, or encoded polypeptides, to a modified plant biochemical 
characteristics phenotype. 

25. A plant comprising altered expression levels of an isolated or recombinant polynucleotide 
of claim 4. 

26. A plant comprising altered expression levels or the activity of an isolated or recombinant 
polypeptide of claim 11. 

27. A plant lacking a nucleotide sequence encoding a polynucleotide of claim 11. 

Figure 2 

Figure 3A 

Figure 3B 

Figure 3C 

37 | G308 | 7780253 | 1.10E-57 | Lotus japonicus 
37 | G308 | 6733213 | 3.70E-48 | Lycopersicon esculentum 
39 | G1944 | 9204125 | 5.50E-52 | Glycine max 
39 | G1944 | 7624850 | 6.60E-45 | Gossypium arboreum 
39 | G1944 | 7784135 | 7.20E-32 | Lotus japonicus 
39 | G1944 | 9280727 | 2.60E-29 | Oryza sativa 
39 | G1944 | 7009437 | 1.30E-28 | Zea mays 
39 | G1944 | 7536402 | 1.30E-28 | Sorghum bicolor 
39 | G1944 | 8104258 | 6.50E-27 | Lycopersicon esculentum 
39 | G1944 | 2213533 | 3.50E-23 | Pisum sativum 
39 | G1944 | 4165182 | 7.10E-17 | Antirrhinum majus 
39 | G1944 | 6555294 | 2.90E-16 | Pinus taeda 
41 | G326 | 7410432 | 1.10E-37 | Lycopersicon esculentum 
41 | G326 | 3618319 | 2.90E-32 | Oryza sativa 
41 | G326 | 7571599 | 4.90E-30 | Medicago truncatula 
41 | G326 | 7232283 | 6.30E-28 | Glycine max 
41 | G326 | 7323708 | 6.00E-27 | Lycopersicon hirsutum 
41 | G326 | 4091805 | 2.30E-19 | Malus domestica 
41 | G326 | 6917805 | 6.50E-19 | Lycopersicon pennellii 
41 | G326 | 3341722 | 2.50E-18 | Raphanus sativus 
41 | G326 | 4557092 | 7.50E-18 | Pinus radiata 
41 | G326 | 2303680 | 4.70E-17 | Brassica napus 
43 | G1387 | 8285738 | 1.40E-46 | Glycine max 
43 | G1387 | 8103850 | 5.20E-46 | Lycopersicon esculentum 
43 | G1387 | 5056299 | 1.10E-20 | Brassica rapa subsp. pekinensis 

Figure 3E 

SEQ ID No. | GID | [header illegible] | P-value | [header illegible] 
43 | G1387 | 9278522 | 1.50E-18 | Lotus japonicus 
43 | G1387 | 5859978 | 2.00E-15 | Pinus taeda 
43 | G1387 | 7766740 | 4.70E-14 | Medicago truncatula 
43 | G1387 | 9427282 | 1.40E-12 | Triticum aestivum 
43 | G1387 | 3857766 | 3.40E-12 | Populus balsamifera subsp. trichocarpa 
43 | G1387 | 19506 | 4.60E-12 | Lupinus polyphyllus 
43 | G1387 | 7273843 | 2.20E-11 | Oryza sativa 
MBI-20 Sequence Listing.ST25
SEQUENCE LISTING 



<110> Creelman, Robert
Yu, Guo-Liang
Adam, Luc
Riechmann, Jose Luis
Heard, Jacqueline
Samaha, Raymond
Pilgrim, Marsha
Pineda, Omaira
Jiang, Cai-Zhong

<120> Plant Biochemistry-Related Genes 



<130> MBI-0020 



<150> 60/164,132 

<151> 1999-11-17 

<150> 60/197,899 

<151> 2000-04-17 



<150> Plant Trait Modification III 
<151> 2000-08-22 



<160> 44 



<170> PatentIn version 3.0



<210> 1 

<211> 2240 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (238)..(2064)

<223> G214 

<400> 1 

tgagatttct ccatttccgt agcttctggt ctcttttctt tgtttcattg atcaaaagca 60 

aatcacttct tcttcttctt cttctcgatt tcttactgtt ttcttatcca acgaaatctg 120 

gaattaaaaa tggaatcttt atcgaatcca agctgatttt gtttctttca ttgaatcatc 180 

tctctaaagt ggaattttgt aaagagaaga tctgaagttg tgtagaggag cttagtg 237 

atg gag aca aat tcg tct gga gaa gat ctg gtt att aag act cgg aag 285
Met Glu Thr Asn Ser Ser Gly Glu Asp Leu Val Ile Lys Thr Arg Lys
1 5 10 15

cca tat acg ata aca aag caa cgt gaa agg tgg act gag gaa gaa cat 333
Pro Tyr Thr Ile Thr Lys Gln Arg Glu Arg Trp Thr Glu Glu Glu His
20 25 30

aat aga ttc att gaa gct ttg agg ctt tat ggt aga gca tgg cag aag 381
Asn Arg Phe Ile Glu Ala Leu Arg Leu Tyr Gly Arg Ala Trp Gln Lys
35 40 45

att gaa gaa cat gta gca aca aaa act gct gtc cag ata aga agt cac 429
Ile Glu Glu His Val Ala Thr Lys Thr Ala Val Gln Ile Arg Ser His
50 55 60

gct cag aaa ttt ttc tcc aag gta gag aaa gag gct gaa gct aaa ggt 477
Ala Gln Lys Phe Phe Ser Lys Val Glu Lys Glu Ala Glu Ala Lys Gly
65 70 75 80

gta gct atg ggt caa gcg cta gac ata gct att cct cct cca cgg cct 525
Val Ala Met Gly Gln Ala Leu Asp Ile Ala Ile Pro Pro Pro Arg Pro
85 90 95

aag cgt aaa cca aac aat cct tat cct cga aag acg gga agt gga acg 573
Lys Arg Lys Pro Asn Asn Pro Tyr Pro Arg Lys Thr Gly Ser Gly Thr
100 105 110

atc ctt atg tca aaa acg ggt gtg aat gat gga aaa gag tcc ctt gga 621
Ile Leu Met Ser Lys Thr Gly Val Asn Asp Gly Lys Glu Ser Leu Gly
115 120 125

tca gaa aaa gtg tcg cat cct gag atg gcc aat gaa gat cga caa caa 669
Ser Glu Lys Val Ser His Pro Glu Met Ala Asn Glu Asp Arg Gln Gln
130 135 140

tca aag cct gaa gag aaa act ctg cag gaa gac aac tgt tca gat tgt 717
Ser Lys Pro Glu Glu Lys Thr Leu Gln Glu Asp Asn Cys Ser Asp Cys
145 150 155 160

ttc act cat cag tat ctc tct gct gca tcc tcc atg aat aaa agt tgt 765
Phe Thr His Gln Tyr Leu Ser Ala Ala Ser Ser Met Asn Lys Ser Cys
165 170 175

ata gag aca tca aac gca agc act ttc cgc gag ttc ttg cct tca cgg 813
Ile Glu Thr Ser Asn Ala Ser Thr Phe Arg Glu Phe Leu Pro Ser Arg
180 185 190

gaa gag gga agt cag aat aac agg gta aga aag gag tca aac tca gat 861
Glu Glu Gly Ser Gln Asn Asn Arg Val Arg Lys Glu Ser Asn Ser Asp
195 200 205

ttg aat gca aaa tct ctg gaa aac ggt aat gag caa gga cct cag act 909
Leu Asn Ala Lys Ser Leu Glu Asn Gly Asn Glu Gln Gly Pro Gln Thr
210 215 220

tat ccg atg cat atc cct gtg cta gtg cca ttg ggg agc tca ata aca 957
Tyr Pro Met His Ile Pro Val Leu Val Pro Leu Gly Ser Ser Ile Thr
225 230 235 240

agt tct cta tca cat cct cct tca gag cca gat agt cat ccc cac aca 1005
Ser Ser Leu Ser His Pro Pro Ser Glu Pro Asp Ser His Pro His Thr
245 250 255

gtt gca gga gat tat cag tcg ttt cct aat cat ata atg tca acc ctt 1053
Val Ala Gly Asp Tyr Gln Ser Phe Pro Asn His Ile Met Ser Thr Leu
260 265 270

tta caa aca ccg gct ctt tat act gcc gca act ttc gcc tca tca ttt 1101
Leu Gln Thr Pro Ala Leu Tyr Thr Ala Ala Thr Phe Ala Ser Ser Phe
275 280 285

tgg cct ccc gat tct agt ggt ggc tca cct gtt cca ggg aac tca cct 1149
Trp Pro Pro Asp Ser Ser Gly Gly Ser Pro Val Pro Gly Asn Ser Pro
290 295 300

ccg aat ctg gct gcc atg gcc gca gcc act gtt gca gct gct agt gct 1197
Pro Asn Leu Ala Ala Met Ala Ala Ala Thr Val Ala Ala Ala Ser Ala
305 310 315 320

tgg tgg gct gcc aat gga tta tta cct tta tgt gct cct ctt agt tca 1245
Trp Trp Ala Ala Asn Gly Leu Leu Pro Leu Cys Ala Pro Leu Ser Ser
325 330 335

ggt ggt ttc act agt cat cct cca tct act ttt gga cca tca tgt gat 1293
Gly Gly Phe Thr Ser His Pro Pro Ser Thr Phe Gly Pro Ser Cys Asp
340 345 350

gta gag tac aca aaa gca agc act tta caa cat ggt tct gtg cag agc 1341
Val Glu Tyr Thr Lys Ala Ser Thr Leu Gln His Gly Ser Val Gln Ser
355 360 365

cga gag caa gaa cac tcc gag gca tca aag gct cga tct tca ctg gac 1389
Arg Glu Gln Glu His Ser Glu Ala Ser Lys Ala Arg Ser Ser Leu Asp
370 375 380

tca gag gat gtt gaa aat aag agt aaa cca gtt tgt cat gag cag cct 1437
Ser Glu Asp Val Glu Asn Lys Ser Lys Pro Val Cys His Glu Gln Pro
385 390 395 400

tct gca aca cct gag agt gat gca aag ggt tca gat gga gca gga gac 1485
Ser Ala Thr Pro Glu Ser Asp Ala Lys Gly Ser Asp Gly Ala Gly Asp
405 410 415

aga aaa caa gtt gac cgg tcc tcg tgt ggc tca aac act ccg tcg agt 1533
Arg Lys Gln Val Asp Arg Ser Ser Cys Gly Ser Asn Thr Pro Ser Ser
420 425 430

agt gat gat gtt gag gcg gat gca tca gaa agg caa gag gat ggc acc 1581
Ser Asp Asp Val Glu Ala Asp Ala Ser Glu Arg Gln Glu Asp Gly Thr
435 440 445

aat ggt gag gtg aaa gaa acg aat gaa gac act aat aaa cct caa act 1629
Asn Gly Glu Val Lys Glu Thr Asn Glu Asp Thr Asn Lys Pro Gln Thr
450 455 460

tca gag tcc aat gca cgc cgc agt aga atc agc tcc aat ata acc gat 1677
Ser Glu Ser Asn Ala Arg Arg Ser Arg Ile Ser Ser Asn Ile Thr Asp
465 470 475 480

cca tgg aag tct gtg tct gac gag ggt cga att gcc ttc caa gct ctc 1725
Pro Trp Lys Ser Val Ser Asp Glu Gly Arg Ile Ala Phe Gln Ala Leu
485 490 495

ttc tcc aga gag gta ttg ccg caa agt ttt aca tat cga gaa gaa cac 1773
Phe Ser Arg Glu Val Leu Pro Gln Ser Phe Thr Tyr Arg Glu Glu His
500 505 510

aga gag gaa gaa caa caa caa caa gaa caa aga tat cca atg gca ctt 1821
Arg Glu Glu Glu Gln Gln Gln Gln Glu Gln Arg Tyr Pro Met Ala Leu
515 520 525

gat ctt aac ttc aca gct cag tta aca cca gtt gat gat caa gag gag 1869
Asp Leu Asn Phe Thr Ala Gln Leu Thr Pro Val Asp Asp Gln Glu Glu
530 535 540

aag aga aac aca gga ttt ctt gga atc gga tta gat gct tca aag cta 1917
Lys Arg Asn Thr Gly Phe Leu Gly Ile Gly Leu Asp Ala Ser Lys Leu
545 550 555 560

atg agt aga gga aga aca ggt ttt aaa cca tac aaa aga tgt tcc atg 1965
Met Ser Arg Gly Arg Thr Gly Phe Lys Pro Tyr Lys Arg Cys Ser Met
565 570 575

gaa gcc aaa gaa agt aga atc ctc aac aac aat cct atc att cat gtg 2013
Glu Ala Lys Glu Ser Arg Ile Leu Asn Asn Asn Pro Ile Ile His Val
580 585 590

gaa cag aaa gat ccc aaa cgg atg cgg ttg gaa act caa gct tcc aca 2061
Glu Gln Lys Asp Pro Lys Arg Met Arg Leu Glu Thr Gln Ala Ser Thr
595 600 605

tga gactctattt tcatctgatc tgttgtttgt actctgtttt taagttttca 2114

agaccactgc tacattttct ttttcttttg aggcctttgt atttgtttcc ttgtccatag 2174

tcttcctgta acatttgact ctgtattatt caacaaatca taaactgttt aatctttttt 2234

tttcca 2240

<210> 2 

<211> 608 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 2 

Met Glu Thr Asn Ser Ser Gly Glu Asp Leu Val Ile Lys Thr Arg Lys
1 5 10 15

Pro Tyr Thr Ile Thr Lys Gln Arg Glu Arg Trp Thr Glu Glu Glu His
20 25 30

Asn Arg Phe Ile Glu Ala Leu Arg Leu Tyr Gly Arg Ala Trp Gln Lys
35 40 45

Ile Glu Glu His Val Ala Thr Lys Thr Ala Val Gln Ile Arg Ser His
50 55 60

Ala Gln Lys Phe Phe Ser Lys Val Glu Lys Glu Ala Glu Ala Lys Gly
65 70 75 80

Val Ala Met Gly Gln Ala Leu Asp Ile Ala Ile Pro Pro Pro Arg Pro
85 90 95

Lys Arg Lys Pro Asn Asn Pro Tyr Pro Arg Lys Thr Gly Ser Gly Thr
100 105 110

Ile Leu Met Ser Lys Thr Gly Val Asn Asp Gly Lys Glu Ser Leu Gly
115 120 125

Ser Glu Lys Val Ser His Pro Glu Met Ala Asn Glu Asp Arg Gln Gln
130 135 140

Ser Lys Pro Glu Glu Lys Thr Leu Gln Glu Asp Asn Cys Ser Asp Cys
145 150 155 160

Phe Thr His Gln Tyr Leu Ser Ala Ala Ser Ser Met Asn Lys Ser Cys
165 170 175

Ile Glu Thr Ser Asn Ala Ser Thr Phe Arg Glu Phe Leu Pro Ser Arg
180 185 190

Glu Glu Gly Ser Gln Asn Asn Arg Val Arg Lys Glu Ser Asn Ser Asp
195 200 205

Leu Asn Ala Lys Ser Leu Glu Asn Gly Asn Glu Gln Gly Pro Gln Thr
210 215 220

Tyr Pro Met His Ile Pro Val Leu Val Pro Leu Gly Ser Ser Ile Thr
225 230 235 240

Ser Ser Leu Ser His Pro Pro Ser Glu Pro Asp Ser His Pro His Thr
245 250 255

Val Ala Gly Asp Tyr Gln Ser Phe Pro Asn His Ile Met Ser Thr Leu
260 265 270

Leu Gln Thr Pro Ala Leu Tyr Thr Ala Ala Thr Phe Ala Ser Ser Phe
275 280 285

Trp Pro Pro Asp Ser Ser Gly Gly Ser Pro Val Pro Gly Asn Ser Pro
290 295 300

Pro Asn Leu Ala Ala Met Ala Ala Ala Thr Val Ala Ala Ala Ser Ala
305 310 315 320

Trp Trp Ala Ala Asn Gly Leu Leu Pro Leu Cys Ala Pro Leu Ser Ser
325 330 335

Gly Gly Phe Thr Ser His Pro Pro Ser Thr Phe Gly Pro Ser Cys Asp
340 345 350

Val Glu Tyr Thr Lys Ala Ser Thr Leu Gln His Gly Ser Val Gln Ser
355 360 365

Arg Glu Gln Glu His Ser Glu Ala Ser Lys Ala Arg Ser Ser Leu Asp
370 375 380

Ser Glu Asp Val Glu Asn Lys Ser Lys Pro Val Cys His Glu Gln Pro
385 390 395 400

Ser Ala Thr Pro Glu Ser Asp Ala Lys Gly Ser Asp Gly Ala Gly Asp
405 410 415

Arg Lys Gln Val Asp Arg Ser Ser Cys Gly Ser Asn Thr Pro Ser Ser
420 425 430

Ser Asp Asp Val Glu Ala Asp Ala Ser Glu Arg Gln Glu Asp Gly Thr
435 440 445

Asn Gly Glu Val Lys Glu Thr Asn Glu Asp Thr Asn Lys Pro Gln Thr
450 455 460

Ser Glu Ser Asn Ala Arg Arg Ser Arg Ile Ser Ser Asn Ile Thr Asp
465 470 475 480

Pro Trp Lys Ser Val Ser Asp Glu Gly Arg Ile Ala Phe Gln Ala Leu
485 490 495

Phe Ser Arg Glu Val Leu Pro Gln Ser Phe Thr Tyr Arg Glu Glu His
500 505 510

Arg Glu Glu Glu Gln Gln Gln Gln Glu Gln Arg Tyr Pro Met Ala Leu
515 520 525

Asp Leu Asn Phe Thr Ala Gln Leu Thr Pro Val Asp Asp Gln Glu Glu
530 535 540

Lys Arg Asn Thr Gly Phe Leu Gly Ile Gly Leu Asp Ala Ser Lys Leu
545 550 555 560

Met Ser Arg Gly Arg Thr Gly Phe Lys Pro Tyr Lys Arg Cys Ser Met
565 570 575

Glu Ala Lys Glu Ser Arg Ile Leu Asn Asn Asn Pro Ile Ile His Val
580 585 590

Glu Gln Lys Asp Pro Lys Arg Met Arg Leu Glu Thr Gln Ala Ser Thr
595 600 605



<210> 3 
<211> 916 
<212> DNA 

<213> Arabidopsis thaliana 
<220>

<221> CDS

<222> (88)..(888)

<223> G231 

<400> 3 

ttccatatct cttccatttc gctctctatt tcacatcccc atataacata atatacaatc 60

acacatatca tttctatata gtattta atg ggg aga cag cca tgc tgt gac aag 114
Met Gly Arg Gln Pro Cys Cys Asp Lys
1 5

cta ggg gtg aag aaa ggg ccg tgg acg gtg gag gaa gat aag aag ctt 162
Leu Gly Val Lys Lys Gly Pro Trp Thr Val Glu Glu Asp Lys Lys Leu
10 15 20 25

ata aac ttc ata cta acc aat ggc cat tgt tgc tgg cgt gct ttg ccg 210
Ile Asn Phe Ile Leu Thr Asn Gly His Cys Cys Trp Arg Ala Leu Pro
30 35 40

aag ctg gcc ggt ctc cgt cgc tgt gga aag agc tgc cgc ctc cgg tgg 258
Lys Leu Ala Gly Leu Arg Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp
45 50 55

act aac tat ctc cgg cct ggc tta aaa cga ggc ctt ctc tcg cat gat 306
Thr Asn Tyr Leu Arg Pro Gly Leu Lys Arg Gly Leu Leu Ser His Asp
60 65 70

gaa gaa caa ctt gtc ata gat ctt cat gct aat ctc ggc aat aag tgg 354
Glu Glu Gln Leu Val Ile Asp Leu His Ala Asn Leu Gly Asn Lys Trp
75 80 85

tct aag ata gct tca aga tta cct gga aga aca gat aac gaa ata aaa 402
Ser Lys Ile Ala Ser Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys
90 95 100 105

aac cat tgg aat act cat atc aag aag aaa ctt ctt aag atg gga atc 450
Asn His Trp Asn Thr His Ile Lys Lys Lys Leu Leu Lys Met Gly Ile
110 115 120

gat cct atg acc cat caa ccc cta aat caa gaa cct tct aat atc gat 498
Asp Pro Met Thr His Gln Pro Leu Asn Gln Glu Pro Ser Asn Ile Asp
125 130 135

aat tcc aaa acc att ccg tcc aat cca gac gat gtc tca gtg gaa cca 546
Asn Ser Lys Thr Ile Pro Ser Asn Pro Asp Asp Val Ser Val Glu Pro
140 145 150

aag aca act aac acg aaa tac gtg gag ata agt gtc acg aca aca gaa 594
Lys Thr Thr Asn Thr Lys Tyr Val Glu Ile Ser Val Thr Thr Thr Glu
155 160 165

gaa gaa agt agt agc acg gtt act gat caa aac agt tcg atg gat aat 642
Glu Glu Ser Ser Ser Thr Val Thr Asp Gln Asn Ser Ser Met Asp Asn
170 175 180 185

gaa aat cat cta att gac aac att tat gat gat gat gaa ttg ttt agt 690
Glu Asn His Leu Ile Asp Asn Ile Tyr Asp Asp Asp Glu Leu Phe Ser
190 195 200

tac tta tgg tcc gac gaa act act aaa gat gag gcc tct tgg agt gat 738
Tyr Leu Trp Ser Asp Glu Thr Thr Lys Asp Glu Ala Ser Trp Ser Asp
205 210 215

agt aac ttt ggt gtt ggt gga aca tta tat gac cac aat atc tcc ggc 786
Ser Asn Phe Gly Val Gly Gly Thr Leu Tyr Asp His Asn Ile Ser Gly
220 225 230

gcc gat gca gat ttt ccg ata tgg tca ccg gaa aga atc aat gac gag 834
Ala Asp Ala Asp Phe Pro Ile Trp Ser Pro Glu Arg Ile Asn Asp Glu
235 240 245

aag atg ttt ttg gat tat tgt caa gac ttt ggt gtt cat gat ttt ggg 882
Lys Met Phe Leu Asp Tyr Cys Gln Asp Phe Gly Val His Asp Phe Gly
250 255 260 265

ttt tga ctgttcacca ttgacatatt ggcaacgc 916
Phe



<210> 4 
<211> 266 
<212> PRT 

<213> Arabidopsis thaliana
<400> 4 

Met Gly Arg Gln Pro Cys Cys Asp Lys Leu Gly Val Lys Lys Gly Pro
1 5 10 15

Trp Thr Val Glu Glu Asp Lys Lys Leu Ile Asn Phe Ile Leu Thr Asn
20 25 30

Gly His Cys Cys Trp Arg Ala Leu Pro Lys Leu Ala Gly Leu Arg Arg
35 40 45

Cys Gly Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu Arg Pro Gly
50 55 60

Leu Lys Arg Gly Leu Leu Ser His Asp Glu Glu Gln Leu Val Ile Asp
65 70 75 80

Leu His Ala Asn Leu Gly Asn Lys Trp Ser Lys Ile Ala Ser Arg Leu
85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn His Trp Asn Thr His Ile
100 105 110

Lys Lys Lys Leu Leu Lys Met Gly Ile Asp Pro Met Thr His Gln Pro
115 120 125

Leu Asn Gln Glu Pro Ser Asn Ile Asp Asn Ser Lys Thr Ile Pro Ser
130 135 140

Asn Pro Asp Asp Val Ser Val Glu Pro Lys Thr Thr Asn Thr Lys Tyr
145 150 155 160

Val Glu Ile Ser Val Thr Thr Thr Glu Glu Glu Ser Ser Ser Thr Val
165 170 175

Thr Asp Gln Asn Ser Ser Met Asp Asn Glu Asn His Leu Ile Asp Asn
180 185 190

Ile Tyr Asp Asp Asp Glu Leu Phe Ser Tyr Leu Trp Ser Asp Glu Thr
195 200 205

Thr Lys Asp Glu Ala Ser Trp Ser Asp Ser Asn Phe Gly Val Gly Gly
210 215 220

Thr Leu Tyr Asp His Asn Ile Ser Gly Ala Asp Ala Asp Phe Pro Ile
225 230 235 240

Trp Ser Pro Glu Arg Ile Asn Asp Glu Lys Met Phe Leu Asp Tyr Cys
245 250 255

Gln Asp Phe Gly Val His Asp Phe Gly Phe
260 265

<210> 5 
<211> 2371 
<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (172)..(2037)
<400> 5 

gacattattt taagtgtgtt ctctctctgt cacactcaca aagctttata ctttctggct 60 

actgcaagct catcagtgaa aagagcttaa accagagaga tctgataaga gaaattttag 120 

agtctctctg cttcaacaag atctacatcg accaggagat tagaaagaat c atg ggt 177 

Met Gly
1

tct aag cat aac cca cca ggg aat aac aga tcg aga agt aca cta tct 225
Ser Lys His Asn Pro Pro Gly Asn Asn Arg Ser Arg Ser Thr Leu Ser
5 10 15

cta ctc gtt gtg gtt ggt tta tgt tgt ttc ttc tat ctt ctt gga gca 273
Leu Leu Val Val Val Gly Leu Cys Cys Phe Phe Tyr Leu Leu Gly Ala
20 25 30

tgg caa aag agt ggg ttt ggt aaa gga gat agc ata gct atg gag att 321
Trp Gln Lys Ser Gly Phe Gly Lys Gly Asp Ser Ile Ala Met Glu Ile
35 40 45 50

aca aag caa gcg cag tgt act gac att gtc act gat ctt gat ttt gaa 369
Thr Lys Gln Ala Gln Cys Thr Asp Ile Val Thr Asp Leu Asp Phe Glu
55 60 65

cct cat cac aac aca gtg aag atc cca cat aaa gct gat ccc aaa cct 417
Pro His His Asn Thr Val Lys Ile Pro His Lys Ala Asp Pro Lys Pro
70 75 80

gtt tct ttc aaa ccg tgt gat gtg aag ctc aag gat tac acg cct tgt 465
Val Ser Phe Lys Pro Cys Asp Val Lys Leu Lys Asp Tyr Thr Pro Cys
85 90 95

caa gag caa gac cga gct atg aag ttc ccg aga gag aac atg att tac 513
Gln Glu Gln Asp Arg Ala Met Lys Phe Pro Arg Glu Asn Met Ile Tyr
100 105 110

aga gag aga cat tgt cct cct gat aat gag aag ctg cgt tgt ctt gtt 561
Arg Glu Arg His Cys Pro Pro Asp Asn Glu Lys Leu Arg Cys Leu Val
115 120 125 130

cca gct cct aaa ggg tat atg act cct ttc cct tgg cct aaa agc aga 609
Pro Ala Pro Lys Gly Tyr Met Thr Pro Phe Pro Trp Pro Lys Ser Arg
135 140 145

gat tat gtt cac tat gct aat gct cct ttc aag agc ttg act gtc gaa 657
Asp Tyr Val His Tyr Ala Asn Ala Pro Phe Lys Ser Leu Thr Val Glu
150 155 160

aaa gct gga cag aat tgg gtt cag ttt caa ggg aat gtg ttt aaa ttc 705
Lys Ala Gly Gln Asn Trp Val Gln Phe Gln Gly Asn Val Phe Lys Phe
165 170 175

cct ggt gga gga act atg ttt cct caa ggt gct gat gcg tat att gaa 753
Pro Gly Gly Gly Thr Met Phe Pro Gln Gly Ala Asp Ala Tyr Ile Glu
180 185 190

gag cta gct tct gtt atc cct atc aaa gat ggc tct gtt aga acc gca 801
Glu Leu Ala Ser Val Ile Pro Ile Lys Asp Gly Ser Val Arg Thr Ala
195 200 205 210

ttg gac act gga tgt ggg gtt gct agt tgg ggt gct tat atg ctt aag 849
Leu Asp Thr Gly Cys Gly Val Ala Ser Trp Gly Ala Tyr Met Leu Lys
215 220 225

agg aat gtt ttg act atg tcg ttt gcg cca agg gat aac cac gaa gca 897
Arg Asn Val Leu Thr Met Ser Phe Ala Pro Arg Asp Asn His Glu Ala
230 235 240

caa gtc cag ttt gcg ctt gag aga ggt gtt cca gcg att atc gct gtt 945
Gln Val Gln Phe Ala Leu Glu Arg Gly Val Pro Ala Ile Ile Ala Val
245 250 255

ctt gga tca atc ctt ctt cct tac cct gca aga gcc ttt gac atg gct 993
Leu Gly Ser Ile Leu Leu Pro Tyr Pro Ala Arg Ala Phe Asp Met Ala
260 265 270

caa tgc tct cga tgc ttg ata cca tgg acc gca aac gag gga aca tac 1041
Gln Cys Ser Arg Cys Leu Ile Pro Trp Thr Ala Asn Glu Gly Thr Tyr
275 280 285 290

tta atg gaa gta gat aga gtc ttg aga cct gga ggt tac tgg gtc tta 1089
Leu Met Glu Val Asp Arg Val Leu Arg Pro Gly Gly Tyr Trp Val Leu
295 300 305

tcg ggt cct cca atc aac tgg aag aca tgg cac aag acg tgg aac cga 1137
Ser Gly Pro Pro Ile Asn Trp Lys Thr Trp His Lys Thr Trp Asn Arg
310 315 320

act aaa gca gag cta aat gcc gag caa aag aga ata gag gga atc gca 1185
Thr Lys Ala Glu Leu Asn Ala Glu Gln Lys Arg Ile Glu Gly Ile Ala
325 330 335

gag tcc tta tgc tgg gag aag aag tat gag aag gga gac att gca att 1233
Glu Ser Leu Cys Trp Glu Lys Lys Tyr Glu Lys Gly Asp Ile Ala Ile
340 345 350

ttc aga aag aaa ata aac gat aga tca tgc gat aga tca aca ccg gtt 1281
Phe Arg Lys Lys Ile Asn Asp Arg Ser Cys Asp Arg Ser Thr Pro Val
355 360 365 370

gac acc tgc aaa aga aag gac act gac gat gtc tgg tac aag gag ata 1329
Asp Thr Cys Lys Arg Lys Asp Thr Asp Asp Val Trp Tyr Lys Glu Ile
375 380 385

gaa acg tgt gta aca cca ttc cct aaa gta tca aac gaa gaa gaa gtt 1377
Glu Thr Cys Val Thr Pro Phe Pro Lys Val Ser Asn Glu Glu Glu Val
390 395 400

gct gga gga aag cta aag aag ttc ccc gag agg cta ttc gca gtg cct 1425
Ala Gly Gly Lys Leu Lys Lys Phe Pro Glu Arg Leu Phe Ala Val Pro
405 410 415

cca agt atc tct aaa ggt ttg att aat ggc gtc gac gag gaa tca tac 1473
Pro Ser Ile Ser Lys Gly Leu Ile Asn Gly Val Asp Glu Glu Ser Tyr
420 425 430

caa gaa gac atc aat cta tgg aag aag cga gtg acc gga tac aag aga 1521
Gln Glu Asp Ile Asn Leu Trp Lys Lys Arg Val Thr Gly Tyr Lys Arg
435 440 445 450

att aac aga ctg ata ggt tcc acc aga tac cgt aat gtg atg gat atg 1569
Ile Asn Arg Leu Ile Gly Ser Thr Arg Tyr Arg Asn Val Met Asp Met
455 460 465

aac gcc ggt ctt ggt gga ttc gct gct gcg ctt gaa tcg cct aaa tcg 1617
Asn Ala Gly Leu Gly Gly Phe Ala Ala Ala Leu Glu Ser Pro Lys Ser
470 475 480

tgg gtt atg aat gtg att cca acc att aac aag aac aca ttg agt gtt 1665
Trp Val Met Asn Val Ile Pro Thr Ile Asn Lys Asn Thr Leu Ser Val
485 490 495

gtt tat gag aga ggt ctc att ggt atc tat cat gac tgg tgt gaa ggc 1713
Val Tyr Glu Arg Gly Leu Ile Gly Ile Tyr His Asp Trp Cys Glu Gly
500 505 510

ttt tca act tat cca aga aca tac gat ttc att cac gct agt ggt gtc 1761
Phe Ser Thr Tyr Pro Arg Thr Tyr Asp Phe Ile His Ala Ser Gly Val
515 520 525 530

ttc agc ttg tat cag cac agc tgc aaa ctt gag gat att ctt ctt gaa 1809
Phe Ser Leu Tyr Gln His Ser Cys Lys Leu Glu Asp Ile Leu Leu Glu
535 540 545

act gat cgg att tta cga ccg gaa ggg att gtg att ttc cgg gat gag 1857
Thr Asp Arg Ile Leu Arg Pro Glu Gly Ile Val Ile Phe Arg Asp Glu
550 555 560

gtt gat gtt ttg aat gat gtg agg aag atc gtt gat gga atg aga tgg 1905
Val Asp Val Leu Asn Asp Val Arg Lys Ile Val Asp Gly Met Arg Trp
565 570 575

gat act aag tta atg gat cat gaa gac ggt cct ctc gtg ccg gag aag 1953
Asp Thr Lys Leu Met Asp His Glu Asp Gly Pro Leu Val Pro Glu Lys
580 585 590

att ctt gtc gcc acg aag cag tat tgg gta gcc ggc gac gat gga aac 2001
Ile Leu Val Ala Thr Lys Gln Tyr Trp Val Ala Gly Asp Asp Gly Asn
595 600 605 610

aat tct ccg tcg tct tct aat agt gaa gaa gaa taa aacaaaaaca 2047
Asn Ser Pro Ser Ser Ser Asn Ser Glu Glu Glu
615 620

aaaaactcct caggttacta agcttgaagt gtagatctat tttacaacat ctggaaaatt 2107

cttatcaaaa aaggaaggaa tcagaatttc cattaaagaa aggtgtcaaa aaaaagttgt 2167

aaaactatat agtagtgatc aagacgaata tgtgcattta tgttttattt ttgttcccta 2227

gtttttaatt ttattttttt gaaggaagaa aaaattagtt ccatgtgttt ttgcaagata 2287

gttgaaacct tggacgcttg ttatgtatga tgcgatcttg acatttttta ataacagtta 2347

ttttaaataa atttatgata taaa 2371



<210> 6 
<211> 621 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 6 

Met Gly Ser Lys His Asn Pro Pro Gly Asn Asn Arg Ser Arg Ser Thr
1 5 10 15

Leu Ser Leu Leu Val Val Val Gly Leu Cys Cys Phe Phe Tyr Leu Leu
20 25 30

Gly Ala Trp Gln Lys Ser Gly Phe Gly Lys Gly Asp Ser Ile Ala Met
35 40 45

Glu Ile Thr Lys Gln Ala Gln Cys Thr Asp Ile Val Thr Asp Leu Asp
50 55 60

Phe Glu Pro His His Asn Thr Val Lys Ile Pro His Lys Ala Asp Pro
65 70 75 80

Lys Pro Val Ser Phe Lys Pro Cys Asp Val Lys Leu Lys Asp Tyr Thr
85 90 95

Pro Cys Gln Glu Gln Asp Arg Ala Met Lys Phe Pro Arg Glu Asn Met
100 105 110

Ile Tyr Arg Glu Arg His Cys Pro Pro Asp Asn Glu Lys Leu Arg Cys
115 120 125

Leu Val Pro Ala Pro Lys Gly Tyr Met Thr Pro Phe Pro Trp Pro Lys
130 135 140

Ser Arg Asp Tyr Val His Tyr Ala Asn Ala Pro Phe Lys Ser Leu Thr
145 150 155 160

Val Glu Lys Ala Gly Gln Asn Trp Val Gln Phe Gln Gly Asn Val Phe
165 170 175

Lys Phe Pro Gly Gly Gly Thr Met Phe Pro Gln Gly Ala Asp Ala Tyr
180 185 190

Ile Glu Glu Leu Ala Ser Val Ile Pro Ile Lys Asp Gly Ser Val Arg
195 200 205

Thr Ala Leu Asp Thr Gly Cys Gly Val Ala Ser Trp Gly Ala Tyr Met
210 215 220

Leu Lys Arg Asn Val Leu Thr Met Ser Phe Ala Pro Arg Asp Asn His
225 230 235 240

Glu Ala Gln Val Gln Phe Ala Leu Glu Arg Gly Val Pro Ala Ile Ile
245 250 255

Ala Val Leu Gly Ser Ile Leu Leu Pro Tyr Pro Ala Arg Ala Phe Asp
260 265 270

Met Ala Gln Cys Ser Arg Cys Leu Ile Pro Trp Thr Ala Asn Glu Gly
275 280 285

Thr Tyr Leu Met Glu Val Asp Arg Val Leu Arg Pro Gly Gly Tyr Trp
290 295 300

Val Leu Ser Gly Pro Pro Ile Asn Trp Lys Thr Trp His Lys Thr Trp
305 310 315 320

Asn Arg Thr Lys Ala Glu Leu Asn Ala Glu Gln Lys Arg Ile Glu Gly
325 330 335

Ile Ala Glu Ser Leu Cys Trp Glu Lys Lys Tyr Glu Lys Gly Asp Ile
340 345 350

Ala Ile Phe Arg Lys Lys Ile Asn Asp Arg Ser Cys Asp Arg Ser Thr
355 360 365

Pro Val Asp Thr Cys Lys Arg Lys Asp Thr Asp Asp Val Trp Tyr Lys
370 375 380

Glu Ile Glu Thr Cys Val Thr Pro Phe Pro Lys Val Ser Asn Glu Glu
385 390 395 400

Glu Val Ala Gly Gly Lys Leu Lys Lys Phe Pro Glu Arg Leu Phe Ala
405 410 415

Val Pro Pro Ser Ile Ser Lys Gly Leu Ile Asn Gly Val Asp Glu Glu
420 425 430

Ser Tyr Gln Glu Asp Ile Asn Leu Trp Lys Lys Arg Val Thr Gly Tyr
435 440 445

Lys Arg Ile Asn Arg Leu Ile Gly Ser Thr Arg Tyr Arg Asn Val Met
450 455 460

Asp Met Asn Ala Gly Leu Gly Gly Phe Ala Ala Ala Leu Glu Ser Pro
465 470 475 480

Lys Ser Trp Val Met Asn Val Ile Pro Thr Ile Asn Lys Asn Thr Leu
485 490 495

Ser Val Val Tyr Glu Arg Gly Leu Ile Gly Ile Tyr His Asp Trp Cys
500 505 510

Glu Gly Phe Ser Thr Tyr Pro Arg Thr Tyr Asp Phe Ile His Ala Ser
515 520 525

Gly Val Phe Ser Leu Tyr Gln His Ser Cys Lys Leu Glu Asp Ile Leu
530 535 540

Leu Glu Thr Asp Arg Ile Leu Arg Pro Glu Gly Ile Val Ile Phe Arg
545 550 555 560

Asp Glu Val Asp Val Leu Asn Asp Val Arg Lys Ile Val Asp Gly Met
565 570 575

Arg Trp Asp Thr Lys Leu Met Asp His Glu Asp Gly Pro Leu Val Pro
580 585 590

Glu Lys Ile Leu Val Ala Thr Lys Gln Tyr Trp Val Ala Gly Asp Asp
595 600 605

Gly Asn Asn Ser Pro Ser Ser Ser Asn Ser Glu Glu Glu
610 615 620

<210> 7 

<211> 1764 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (1)..(1764) 

<223> G307 

<400> 7 

atg aag aga gat cat cac caa ttc caa ggt cga ttg tcc aac cac ggg 48
Met Lys Arg Asp His His Gln Phe Gln Gly Arg Leu Ser Asn His Gly
1 5 10 15

act tct tct tct tca tca tca atc tct aaa gat aag atg atg atg gtg 96
Thr Ser Ser Ser Ser Ser Ser Ile Ser Lys Asp Lys Met Met Met Val
20 25 30

aaa aaa gaa gaa gac ggt gga ggt aac atg gac gac gag ctt ctc gct 144
Lys Lys Glu Glu Asp Gly Gly Gly Asn Met Asp Asp Glu Leu Leu Ala
35 40 45

gtt tta ggt tac aaa gtt agg tca tcg gag atg gcg gag gtt gct ttg 192
Val Leu Gly Tyr Lys Val Arg Ser Ser Glu Met Ala Glu Val Ala Leu
50 55 60

aaa ctc gaa caa tta gag acg atg atg agt aat gtt caa gaa gat ggt 240
Lys Leu Glu Gln Leu Glu Thr Met Met Ser Asn Val Gln Glu Asp Gly
65 70 75 80

tta tct cat ctc gcg acg gat act gtt cat tat aat ccg tcg gag ctt 288
Leu Ser His Leu Ala Thr Asp Thr Val His Tyr Asn Pro Ser Glu Leu
85 90 95

tat tct tgg ctt gat aat atg ctc tct gag ctt aat cct cct cct ctt 336
Tyr Ser Trp Leu Asp Asn Met Leu Ser Glu Leu Asn Pro Pro Pro Leu
100 105 110

ccg gcg agt tct aac ggt tta gat ccg gtt ctt cct tcg ccg gag att 384
Pro Ala Ser Ser Asn Gly Leu Asp Pro Val Leu Pro Ser Pro Glu Ile
115 120 125

tgt ggt ttt ccg gct tcg gat tat gac ctt aaa gtc att ccc gga aac 432
Cys Gly Phe Pro Ala Ser Asp Tyr Asp Leu Lys Val Ile Pro Gly Asn
130 135 140

gcg att tat cag ttt ccg gcg att gat tct tcg tct tcg tcg aat aat 480
Ala Ile Tyr Gln Phe Pro Ala Ile Asp Ser Ser Ser Ser Ser Asn Asn
145 150 155 160

cag aac aag cgt ttg aaa tca tgc tcg agt cct gat tct atg gtt aca 528
Gln Asn Lys Arg Leu Lys Ser Cys Ser Ser Pro Asp Ser Met Val Thr
165 170 175

tcg act tcg acg ggt acg cag att ggt gga gtc ata gga acg acg gtg 576
Ser Thr Ser Thr Gly Thr Gln Ile Gly Gly Val Ile Gly Thr Thr Val
180 185 190



acg aca acc acc acg aca acg acg gcg gcg gct gag tca act cgt tct 624
Thr Thr Thr Thr Thr Thr Thr Thr Ala Ala Ala Glu Ser Thr Arg Ser
195 200 205

gtt atc ctg gtt gac tcg caa gag aac ggt gtt cgt tta gtc cac gcg 672
Val Ile Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala
210 215 220

ctt atg gct tgt gca gaa gca atc cag cag aac aat ttg act cta gcg 720
Leu Met Ala Cys Ala Glu Ala Ile Gln Gln Asn Asn Leu Thr Leu Ala
225 230 235 240

gaa gct ctt gtg aag caa atc gga tgc tta gct gtg tct caa gcc gga 768
Glu Ala Leu Val Lys Gln Ile Gly Cys Leu Ala Val Ser Gln Ala Gly
245 250 255

gct atg aga aaa gtg gct act tac ttc gcc gaa gct tta gct cgg cgg 816
Ala Met Arg Lys Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg
260 265 270

atc tac cgt ctc tct ccg ccg cag aat cag atc gat cat tgt ctc tcc 864
Ile Tyr Arg Leu Ser Pro Pro Gln Asn Gln Ile Asp His Cys Leu Ser
275 280 285

gat act ctt cag atg cac ttt tac gag act tgt cct tat ctt aaa ttc 912
Asp Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe
290 295 300

gct cac ttc acg gcg aac caa gcg att ctc gaa gct ttt gaa ggt aag 960
Ala His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Glu Gly Lys
305 310 315 320

aag aga gta cac gtc att gat ttc tcg atg aac caa ggt ctt caa tgg 1008
Lys Arg Val His Val Ile Asp Phe Ser Met Asn Gln Gly Leu Gln Trp
325 330 335

cct gcg ctt atg caa gct ctt gcg ctt cga gaa gga ggt cct cca act 1056
Pro Ala Leu Met Gln Ala Leu Ala Leu Arg Glu Gly Gly Pro Pro Thr
340 345 350

ttc cgg tta acc gga att ggt cca ccg gcg ccg gat aat tct gat cat 1104
Phe Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Ser Asp His
355 360 365

ctt cat gaa gtt ggt tgt aaa tta gct cag ctt gcg gag gcg att cac 1152
Leu His Glu Val Gly Cys Lys Leu Ala Gln Leu Ala Glu Ala Ile His
370 375 380

gta gaa ttc gaa tac cgt gga ttc gtt gct aac agc tta gcc gat ctc 1200
Val Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Ser Leu Ala Asp Leu
385 390 395 400

gat gct tcg atg ctt gag ctt aga ccg agc gat acg gaa gct gtt gcg 1248
Asp Ala Ser Met Leu Glu Leu Arg Pro Ser Asp Thr Glu Ala Val Ala
405 410 415

gtg aac tct gtt ttt gag cta cat aag ctc tta ggt cgt ccc ggt ggg 1296
Val Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Pro Gly Gly
420 425 430

ata gag aaa gtt ctc ggc gtt gtg aaa cag att aaa ccg gtg att ttc 1344
Ile Glu Lys Val Leu Gly Val Val Lys Gln Ile Lys Pro Val Ile Phe
435 440 445

acg gtg gtt gag caa gaa tcg aac cat aac gga ccg gtt ttc tta gac 1392
Thr Val Val Glu Gln Glu Ser Asn His Asn Gly Pro Val Phe Leu Asp
450 455 460

cgg ttt act gaa tcg tta cat tat tat tcg act ctg ttt gat tcg ttg 1440
Arg Phe Thr Glu Ser Leu His Tyr Tyr Ser Thr Leu Phe Asp Ser Leu
465 470 475 480

gaa gga gtt ccg aat agt caa gac aaa gtc atg tct gaa gtt tac tta 1488
Glu Gly Val Pro Asn Ser Gln Asp Lys Val Met Ser Glu Val Tyr Leu
485 490 495

ggg aaa cag att tgt aat ctg gtg gct tgt gaa ggt cct gac aga gtc 1536
Gly Lys Gln Ile Cys Asn Leu Val Ala Cys Glu Gly Pro Asp Arg Val
500 505 510

gag aga cac gaa acg ttg agt caa tgg gga aac cgg ttt ggt tcg tcc 1584
Glu Arg His Glu Thr Leu Ser Gln Trp Gly Asn Arg Phe Gly Ser Ser
515 520 525

ggt tta gcg ccg gca cat ctt ggg tct aac gcg ttt aag caa gcg agt 1632
Gly Leu Ala Pro Ala His Leu Gly Ser Asn Ala Phe Lys Gln Ala Ser
530 535 540

atg ctt ttg tct gtg ttt aat agt ggc caa ggt tat cgt gtg gag gag 1680
Met Leu Leu Ser Val Phe Asn Ser Gly Gln Gly Tyr Arg Val Glu Glu
545 550 555 560

agt aat gga tgt ttg atg ttg ggt tgg cac act cgc cca ctc att acc 1728
Ser Asn Gly Cys Leu Met Leu Gly Trp His Thr Arg Pro Leu Ile Thr
565 570 575

acc tcc gct tgg aaa ctc tcg acg gcg gcg cac tga 1764
Thr Ser Ala Trp Lys Leu Ser Thr Ala Ala His
580 585

<210> 8 
<211> 587 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 8 

Met Lys Arg Asp His His Gln Phe Gln Gly Arg Leu Ser Asn His Gly
1 5 10 15

Thr Ser Ser Ser Ser Ser Ser Ile Ser Lys Asp Lys Met Met Met Val
20 25 30

Lys Lys Glu Glu Asp Gly Gly Gly Asn Met Asp Asp Glu Leu Leu Ala
35 40 45

Val Leu Gly Tyr Lys Val Arg Ser Ser Glu Met Ala Glu Val Ala Leu
50 55 60

Lys Leu Glu Gln Leu Glu Thr Met Met Ser Asn Val Gln Glu Asp Gly
65 70 75 80

Leu Ser His Leu Ala Thr Asp Thr Val His Tyr Asn Pro Ser Glu Leu
85 90 95

Tyr Ser Trp Leu Asp Asn Met Leu Ser Glu Leu Asn Pro Pro Pro Leu
100 105 110

Pro Ala Ser Ser Asn Gly Leu Asp Pro Val Leu Pro Ser Pro Glu Ile
115 120 125

Cys Gly Phe Pro Ala Ser Asp Tyr Asp Leu Lys Val Ile Pro Gly Asn
130 135 140

Ala Ile Tyr Gln Phe Pro Ala Ile Asp Ser Ser Ser Ser Ser Asn Asn
145 150 155 160

Gln Asn Lys Arg Leu Lys Ser Cys Ser Ser Pro Asp Ser Met Val Thr
165 170 175

Ser Thr Ser Thr Gly Thr Gln Ile Gly Gly Val Ile Gly Thr Thr Val
180 185 190

Thr Thr Thr Thr Thr Thr Thr Thr Ala Ala Ala Glu Ser Thr Arg Ser
195 200 205

Val Ile Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala
210 215 220

Leu Met Ala Cys Ala Glu Ala Ile Gln Gln Asn Asn Leu Thr Leu Ala
225 230 235 240

Glu Ala Leu Val Lys Gln Ile Gly Cys Leu Ala Val Ser Gln Ala Gly
245 250 255

Ala Met Arg Lys Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg
260 265 270

Ile Tyr Arg Leu Ser Pro Pro Gln Asn Gln Ile Asp His Cys Leu Ser
275 280 285

Asp Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe
290 295 300

Ala His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Glu Gly Lys
305 310 315 320

Lys Arg Val His Val Ile Asp Phe Ser Met Asn Gln Gly Leu Gln Trp
325 330 335

Pro Ala Leu Met Gln Ala Leu Ala Leu Arg Glu Gly Gly Pro Pro Thr
340 345 350

Phe Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Ser Asp His
355 360 365

Leu His Glu Val Gly Cys Lys Leu Ala Gln Leu Ala Glu Ala Ile His
370 375 380

Val Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Ser Leu Ala Asp Leu
385 390 395 400

Asp Ala Ser Met Leu Glu Leu Arg Pro Ser Asp Thr Glu Ala Val Ala
405 410 415

Val Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Pro Gly Gly
420 425 430

Ile Glu Lys Val Leu Gly Val Val Lys Gln Ile Lys Pro Val Ile Phe
435 440 445

Thr Val Val Glu Gln Glu Ser Asn His Asn Gly Pro Val Phe Leu Asp
450 455 460

Arg Phe Thr Glu Ser Leu His Tyr Tyr Ser Thr Leu Phe Asp Ser Leu
465 470 475 480

Glu Gly Val Pro Asn Ser Gln Asp Lys Val Met Ser Glu Val Tyr Leu
485 490 495

Gly Lys Gln Ile Cys Asn Leu Val Ala Cys Glu Gly Pro Asp Arg Val
500 505 510

Glu Arg His Glu Thr Leu Ser Gln Trp Gly Asn Arg Phe Gly Ser Ser
515 520 525

Gly Leu Ala Pro Ala His Leu Gly Ser Asn Ala Phe Lys Gln Ala Ser
530 535 540

Met Leu Leu Ser Val Phe Asn Ser Gly Gln Gly Tyr Arg Val Glu Glu
545 550 555 560

Ser Asn Gly Cys Leu Met Leu Gly Trp His Thr Arg Pro Leu Ile Thr
565 570 575

Thr Ser Ala Trp Lys Leu Ser Thr Ala Ala His
580 585 



<210> 9 
<211> 825 
<212> DNA 
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (1)..(825)

<223> G346 

<400> 9 

atg gaa atg gaa tca ttc atg gac gac ctt ttg aac ttc tct gta ccg 48
Met Glu Met Glu Ser Phe Met Asp Asp Leu Leu Asn Phe Ser Val Pro
1 5 10 15

gaa gag gaa gaa gac gac gac gaa cat acg caa cca ccg agg aat att 96
Glu Glu Glu Glu Asp Asp Asp Glu His Thr Gln Pro Pro Arg Asn Ile
20 25 30

act cgc cgg aaa act gga tta cgg cca aca gac tcc ttc ggt ctc ttt 144
Thr Arg Arg Lys Thr Gly Leu Arg Pro Thr Asp Ser Phe Gly Leu Phe
35 40 45

aat acc gac gac ctt gga gtg gtt gaa gaa gag gat ttg gaa tgg att 192
Asn Thr Asp Asp Leu Gly Val Val Glu Glu Glu Asp Leu Glu Trp Ile
50 55 60

tca aac aaa aat gct ttt ccg gtg att gaa aca ttc gtc ggt gta tta 240
Ser Asn Lys Asn Ala Phe Pro Val Ile Glu Thr Phe Val Gly Val Leu
65 70 75 80

ccg tcg gag cat ttt cct ata acg tct ctt ctg gaa aga gaa gcg act 288
Pro Ser Glu His Phe Pro Ile Thr Ser Leu Leu Glu Arg Glu Ala Thr
85 90 95

gag gta aaa cag ctg agt ccg gtt tca gta ctt gag acg agt agc cat 336
Glu Val Lys Gln Leu Ser Pro Val Ser Val Leu Glu Thr Ser Ser His
100 105 110

agc tcc aca acg act acc tca aac agt agc ggc gga agt aac gga agc 384
Ser Ser Thr Thr Thr Thr Ser Asn Ser Ser Gly Gly Ser Asn Gly Ser
115 120 125

acg gcc gtg gct acg acc acc acc act cca aca ata atg agc tgt tgc 432
Thr Ala Val Ala Thr Thr Thr Thr Thr Pro Thr Ile Met Ser Cys Cys
130 135 140

gtt ggt ttt aaa gcg ccg gct aaa gcg aga agc aag cgt cgt cgt aca 480
Val Gly Phe Lys Ala Pro Ala Lys Ala Arg Ser Lys Arg Arg Arg Thr
145 150 155 160

gga cgc cgt gat tta cga gtt ttg tgg aca gga aac gag caa gga gga 528
Gly Arg Arg Asp Leu Arg Val Leu Trp Thr Gly Asn Glu Gln Gly Gly
165 170 175

ata cag aag aag aag acg atg act gtg gcg gcg gct gcg ttg att atg 576
Ile Gln Lys Lys Lys Thr Met Thr Val Ala Ala Ala Ala Leu Ile Met
180 185 190

gga agg aag tgt caa cac tgt gga gcg gag aag act ccg caa tgg agg 624
Gly Arg Lys Cys Gln His Cys Gly Ala Glu Lys Thr Pro Gln Trp Arg
195 200 205

gca gga cca gcg ggg cct aag act ctg tgt aac gct tgt ggc gtg agg 672
Ala Gly Pro Ala Gly Pro Lys Thr Leu Cys Asn Ala Cys Gly Val Arg
210 215 220

tat aag tcc ggg agg cta gtt ccg gag tat cgt cca gcg aac agt cca 720
Tyr Lys Ser Gly Arg Leu Val Pro Glu Tyr Arg Pro Ala Asn Ser Pro
225 230 235 240

act ttc acg gcg gag tta cat tcg aat tct cac cgg aag att gta gag 768
Thr Phe Thr Ala Glu Leu His Ser Asn Ser His Arg Lys Ile Val Glu
245 250 255

atg agg aag cag tat cag tcc ggt gac ggt gac ggt gat cgg aaa gat 816
Met Arg Lys Gln Tyr Gln Ser Gly Asp Gly Asp Gly Asp Arg Lys Asp
260 265 270

tgt gga taa 825 
Cys Gly 



<210> 10 
<211> 274 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 10 

Met Glu Met Glu Ser Phe Met Asp Asp Leu Leu Asn Phe Ser Val Pro
1 5 10 15

Glu Glu Glu Glu Asp Asp Asp Glu His Thr Gln Pro Pro Arg Asn Ile
20 25 30

Thr Arg Arg Lys Thr Gly Leu Arg Pro Thr Asp Ser Phe Gly Leu Phe
35 40 45

Asn Thr Asp Asp Leu Gly Val Val Glu Glu Glu Asp Leu Glu Trp Ile
50 55 60

Ser Asn Lys Asn Ala Phe Pro Val Ile Glu Thr Phe Val Gly Val Leu
65 70 75 80

Pro Ser Glu His Phe Pro Ile Thr Ser Leu Leu Glu Arg Glu Ala Thr
85 90 95

Glu Val Lys Gln Leu Ser Pro Val Ser Val Leu Glu Thr Ser Ser His
100 105 110

Ser Ser Thr Thr Thr Thr Ser Asn Ser Ser Gly Gly Ser Asn Gly Ser
115 120 125

Thr Ala Val Ala Thr Thr Thr Thr Thr Pro Thr Ile Met Ser Cys Cys
130 135 140

Val Gly Phe Lys Ala Pro Ala Lys Ala Arg Ser Lys Arg Arg Arg Thr
145 150 155 160

Gly Arg Arg Asp Leu Arg Val Leu Trp Thr Gly Asn Glu Gln Gly Gly
165 170 175

Ile Gln Lys Lys Lys Thr Met Thr Val Ala Ala Ala Ala Leu Ile Met
180 185 190

Gly Arg Lys Cys Gln His Cys Gly Ala Glu Lys Thr Pro Gln Trp Arg
195 200 205

Ala Gly Pro Ala Gly Pro Lys Thr Leu Cys Asn Ala Cys Gly Val Arg
210 215 220

Tyr Lys Ser Gly Arg Leu Val Pro Glu Tyr Arg Pro Ala Asn Ser Pro
225 230 235 240

Thr Phe Thr Ala Glu Leu His Ser Asn Ser His Arg Lys Ile Val Glu
245 250 255

Met Arg Lys Gln Tyr Gln Ser Gly Asp Gly Asp Gly Asp Arg Lys Asp
260 265 270

Cys Gly



<210> 11
<211> 1226
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (248)..(1039)
<223> G598
<400> 11



^ws^v^ww Jt tcttgagaat tccacatttt tatccttttt 60 

gtcatgtagt gtatattttt tcctctaacc taattaaaat caaaacaaaa tcctttgacc 120 

caattagctt cgcgatatat cagaagagat caaactactt tgatcagacc atgatcttct 180 

tcttcttctt cttcttcttc ttcttctttt tagacgatca caattcctaa accctatttc 240 

tcagatt atg ctg act ctt tac cat caa gaa agg tca ccg gac gcc aca 289
Met Leu Thr Leu Tyr His Gln Glu Arg Ser Pro Asp Ala Thr
1 5 10

agt aat gat cgc gat gag acg cca gag act gtg gtt aga gaa gtc cac 337
Ser Asn Asp Arg Asp Glu Thr Pro Glu Thr Val Val Arg Glu Val His
15 20 25 30

gcg cta act cca gcg ccg gag gat aat tcc cgg acg atg acg gcg acg 385
Ala Leu Thr Pro Ala Pro Glu Asp Asn Ser Arg Thr Met Thr Ala Thr
35 40 45

cta cct cca ccg cct gct ttc cga ggc tat ttt tct cct cca agg tca 433
Leu Pro Pro Pro Pro Ala Phe Arg Gly Tyr Phe Ser Pro Pro Arg Ser
50 55 60

gcg acg acg atg agc gaa gga gag aac ttc aca act ata agc aga gag 481
Ala Thr Thr Met Ser Glu Gly Glu Asn Phe Thr Thr Ile Ser Arg Glu
65 70 75

ttc aac gct cta gtc atc gcc gga tcc tcc atg gag aac aac gaa cta 529
Phe Asn Ala Leu Val Ile Ala Gly Ser Ser Met Glu Asn Asn Glu Leu
80 85 90

atg act cgt gac gtc acg cag cgt gaa gat gag aga caa gac gag ttg 577
Met Thr Arg Asp Val Thr Gln Arg Glu Asp Glu Arg Gln Asp Glu Leu
95 100 105 110

atg aga atc cac gag gac acg gat cat gaa gag gaa acg aat cct tta 625
Met Arg Ile His Glu Asp Thr Asp His Glu Glu Glu Thr Asn Pro Leu
115 120 125

gca atc gtg ccg gat cag tat cct ggt tcg ggt ttg gat cct gga agt 673
Ala Ile Val Pro Asp Gln Tyr Pro Gly Ser Gly Leu Asp Pro Gly Ser
130 135 140

gat aat ggg ccg ggt cag agt cgg gtt ggg tcg acg gtg caa aga gtt 721
Asp Asn Gly Pro Gly Gln Ser Arg Val Gly Ser Thr Val Gln Arg Val
145 150 155

aag agg gaa gag gtg gaa gcg aag ata acg gcg tgg cag acg gca aaa 769
Lys Arg Glu Glu Val Glu Ala Lys Ile Thr Ala Trp Gln Thr Ala Lys
160 165 170

ctg gct aag att aat aac agg ttt aag agg gaa gac gcc gtt att aac 817
Leu Ala Lys Ile Asn Asn Arg Phe Lys Arg Glu Asp Ala Val Ile Asn
175 180 185 190

ggt tgg ttt aat gaa caa gtt aac aag gcc aac tct tgg atg aag aaa 865
Gly Trp Phe Asn Glu Gln Val Asn Lys Ala Asn Ser Trp Met Lys Lys
195 200 205

att gag tat aat gta ggt tca ttc aac aat cgt cta aat gag gaa gct 913
Ile Glu Tyr Asn Val Gly Ser Phe Asn Asn Arg Leu Asn Glu Glu Ala
210 215 220

aga gga gag aaa agc aaa agc gat gga gaa aac gca aaa caa tgt ggc 961
Arg Gly Glu Lys Ser Lys Ser Asp Gly Glu Asn Ala Lys Gln Cys Gly
225 230 235

gaa agc gca gag gaa agc gga gga gag aag agc gac ggc aga ggc aaa 1009
Glu Ser Ala Glu Glu Ser Gly Gly Glu Lys Ser Asp Gly Arg Gly Lys
240 245 250

gag agg gac aga ggt tgc aaa agt agt tga agttgctaat ctcatgagag 1059 
Glu Arg Asp Arg Gly Cys Lys Ser Ser 
255 260 

cccttggacg tcctcctgcc aaacgctcct tcttctcttt ctcctaattt ttagttatat 1119 

caaaccatta aattaaacag tactcgttat atatctagtt agtaaacaaa ggggcagttt 1179

tatagctcat gtacacataa ttgagagtgt agtactgttg tgtcaaa 1226 

<210> 12 
<211> 263 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 12 

Met Leu Thr Leu Tyr His Gln Glu Arg Ser Pro Asp Ala Thr Ser Asn
1 5 10 15

Asp Arg Asp Glu Thr Pro Glu Thr Val Val Arg Glu Val His Ala Leu
20 25 30

Thr Pro Ala Pro Glu Asp Asn Ser Arg Thr Met Thr Ala Thr Leu Pro
35 40 45

Pro Pro Pro Ala Phe Arg Gly Tyr Phe Ser Pro Pro Arg Ser Ala Thr
50 55 60

Thr Met Ser Glu Gly Glu Asn Phe Thr Thr Ile Ser Arg Glu Phe Asn
65 70 75 80

Ala Leu Val Ile Ala Gly Ser Ser Met Glu Asn Asn Glu Leu Met Thr
85 90 95

Arg Asp Val Thr Gln Arg Glu Asp Glu Arg Gln Asp Glu Leu Met Arg
100 105 110

Ile His Glu Asp Thr Asp His Glu Glu Glu Thr Asn Pro Leu Ala Ile
115 120 125

Val Pro Asp Gln Tyr Pro Gly Ser Gly Leu Asp Pro Gly Ser Asp Asn
130 135 140

Gly Pro Gly Gln Ser Arg Val Gly Ser Thr Val Gln Arg Val Lys Arg
145 150 155 160

Glu Glu Val Glu Ala Lys Ile Thr Ala Trp Gln Thr Ala Lys Leu Ala
165 170 175

Lys Ile Asn Asn Arg Phe Lys Arg Glu Asp Ala Val Ile Asn Gly Trp
180 185 190

Phe Asn Glu Gln Val Asn Lys Ala Asn Ser Trp Met Lys Lys Ile Glu
195 200 205

Tyr Asn Val Gly Ser Phe Asn Asn Arg Leu Asn Glu Glu Ala Arg Gly
210 215 220

Glu Lys Ser Lys Ser Asp Gly Glu Asn Ala Lys Gln Cys Gly Glu Ser
225 230 235 240

Ala Glu Glu Ser Gly Gly Glu Lys Ser Asp Gly Arg Gly Lys Glu Arg
245 250 255

Asp Arg Gly Cys Lys Ser Ser
260 



<210> 13
<211> 1263
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (72)..(1076)
<223> G605
<400> 13



aattccatcc taataatttt caaagcttta attctaagaa ataatatcta caagaaaata 60 

ttatctcatg t atg gag act acc gga gaa gtt gtt aaa aca acc acc ggg 110
Met Glu Thr Thr Gly Glu Val Val Lys Thr Thr Thr Gly
1 5 10

agc gac gga ggc gtt acg gtg gtg aga tcc aac gcg ccg tca gac ttc 158
Ser Asp Gly Gly Val Thr Val Val Arg Ser Asn Ala Pro Ser Asp Phe
15 20 25

cac atg gct ccg agg tca gaa act tca aac aca cct ccc aac tcc gtc 206
His Met Ala Pro Arg Ser Glu Thr Ser Asn Thr Pro Pro Asn Ser Val
30 35 40 45

gct cct cct cct cct cca ccg ccg caa aac tcc ttt act ccg tcg gcg 254
Ala Pro Pro Pro Pro Pro Pro Pro Gln Asn Ser Phe Thr Pro Ser Ala
50 55 60

gct atg gat ggt ttc tca agc gga ccg ata aag aag aga cgt ggg cgc 302
Ala Met Asp Gly Phe Ser Ser Gly Pro Ile Lys Lys Arg Arg Gly Arg
65 70 75

cct agg aag tac gga cac gac gga gca gcg gtg acg cta tct ccg aat 350
Pro Arg Lys Tyr Gly His Asp Gly Ala Ala Val Thr Leu Ser Pro Asn
80 85 90

ccg ata tca tca gcc gca cca acg act tct cac gtc atc gat ttc tcg 398
Pro Ile Ser Ser Ala Ala Pro Thr Thr Ser His Val Ile Asp Phe Ser
95 100 105

acg aca tcg gag aaa cgt ggc aaa atg aaa cca gca act cca act cca 446
Thr Thr Ser Glu Lys Arg Gly Lys Met Lys Pro Ala Thr Pro Thr Pro
110 115 120 125

agc tca ttc atc agg cca aag tac cag gtc gag aat tta ggt gaa tgg 494
Ser Ser Phe Ile Arg Pro Lys Tyr Gln Val Glu Asn Leu Gly Glu Trp
130 135 140

tct cct tcc tct gcc gcc gct aat ttc acg ccg cat att att acg gtg 542
Ser Pro Ser Ser Ala Ala Ala Asn Phe Thr Pro His Ile Ile Thr Val
145 150 155

aat gca ggc gag gac gtt acg aag agg ata ata tca ttt tct caa caa 590
Asn Ala Gly Glu Asp Val Thr Lys Arg Ile Ile Ser Phe Ser Gln Gln
160 165 170

ggg tct cta gct att tgc gtt tta tgc gca aac ggt gtc gtt tcg agc 638
Gly Ser Leu Ala Ile Cys Val Leu Cys Ala Asn Gly Val Val Ser Ser
175 180 185

gtt aca ctt cgt cag cct gat tca tct ggt ggt aca ttg acc tat gag 686
Val Thr Leu Arg Gln Pro Asp Ser Ser Gly Gly Thr Leu Thr Tyr Glu
190 195 200 205

ggt cgg ttt gag ata ttg tca cta tct gga aca ttc atg cct agt gac 734
Gly Arg Phe Glu Ile Leu Ser Leu Ser Gly Thr Phe Met Pro Ser Asp
210 215 220

tca gac ggg aca cga agc aga aca ggc ggg atg agc gtg tcg ctt gct 782
Ser Asp Gly Thr Arg Ser Arg Thr Gly Gly Met Ser Val Ser Leu Ala
225 230 235

agc cct gat gga cgt gta gta ggt ggt ggt gtt gct ggc ttg ctg gtt 830
Ser Pro Asp Gly Arg Val Val Gly Gly Gly Val Ala Gly Leu Leu Val
240 245 250

gca gcc act cct att caa gtg gtt gta gga act ttc tta ggt gga aca 878
Ala Ala Thr Pro Ile Gln Val Val Val Gly Thr Phe Leu Gly Gly Thr
255 260 265

aac cag caa gaa cag aca ccg aag ccg cat aac cac aac ttc atg tct 926
Asn Gln Gln Glu Gln Thr Pro Lys Pro His Asn His Asn Phe Met Ser
270 275 280 285

tct cca tta atg cca act tct tcg aat gta gct gat cat cga acc atc 974
Ser Pro Leu Met Pro Thr Ser Ser Asn Val Ala Asp His Arg Thr Ile
290 295 300

cgt ccc atg aca tct agt ctc ccg atc agt aca tgg aca ccg tct ttt 1022
Arg Pro Met Thr Ser Ser Leu Pro Ile Ser Thr Trp Thr Pro Ser Phe
305 310 315

cct tct gat tca cga cac aag cat tct cat gac ttt aat atc act ttg 1070
Pro Ser Asp Ser Arg His Lys His Ser His Asp Phe Asn Ile Thr Leu
320 325 330

acg tga tttcttcctt gaagaactcg tagatcctct gtattttggt ttccagttta 1126 
Thr 

gggctctaca tgttagactc tcaaagtcta ggtgttatgt tggtctgtca cttaggattg 1186 

tcacttagga ttgttagacc atctccatca atggtttctc attgagaaac tgttcaatat 1246

aaaaataaaa tataatc 1263 

<210> 14 

<211> 334 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 14 

Met Glu Thr Thr Gly Glu Val Val Lys Thr Thr Thr Gly Ser Asp Gly
1 5 10 15

Gly Val Thr Val Val Arg Ser Asn Ala Pro Ser Asp Phe His Met Ala
20 25 30

Pro Arg Ser Glu Thr Ser Asn Thr Pro Pro Asn Ser Val Ala Pro Pro
35 40 45

Pro Pro Pro Pro Pro Gln Asn Ser Phe Thr Pro Ser Ala Ala Met Asp
50 55 60

Gly Phe Ser Ser Gly Pro Ile Lys Lys Arg Arg Gly Arg Pro Arg Lys
65 70 75 80

Tyr Gly His Asp Gly Ala Ala Val Thr Leu Ser Pro Asn Pro Ile Ser
85 90 95

Ser Ala Ala Pro Thr Thr Ser His Val Ile Asp Phe Ser Thr Thr Ser
100 105 110

Glu Lys Arg Gly Lys Met Lys Pro Ala Thr Pro Thr Pro Ser Ser Phe
115 120 125

Ile Arg Pro Lys Tyr Gln Val Glu Asn Leu Gly Glu Trp Ser Pro Ser
130 135 140

Ser Ala Ala Ala Asn Phe Thr Pro His Ile Ile Thr Val Asn Ala Gly
145 150 155 160

Glu Asp Val Thr Lys Arg Ile Ile Ser Phe Ser Gln Gln Gly Ser Leu
165 170 175

Ala Ile Cys Val Leu Cys Ala Asn Gly Val Val Ser Ser Val Thr Leu
180 185 190

Arg Gln Pro Asp Ser Ser Gly Gly Thr Leu Thr Tyr Glu Gly Arg Phe
195 200 205

Glu Ile Leu Ser Leu Ser Gly Thr Phe Met Pro Ser Asp Ser Asp Gly
210 215 220

Thr Arg Ser Arg Thr Gly Gly Met Ser Val Ser Leu Ala Ser Pro Asp
225 230 235 240

Gly Arg Val Val Gly Gly Gly Val Ala Gly Leu Leu Val Ala Ala Thr
245 250 255

Pro Ile Gln Val Val Val Gly Thr Phe Leu Gly Gly Thr Asn Gln Gln
260 265 270

Glu Gln Thr Pro Lys Pro His Asn His Asn Phe Met Ser Ser Pro Leu
275 280 285

Met Pro Thr Ser Ser Asn Val Ala Asp His Arg Thr Ile Arg Pro Met
290 295 300

Thr Ser Ser Leu Pro Ile Ser Thr Trp Thr Pro Ser Phe Pro Ser Asp
305 310 315 320

Ser Arg His Lys His Ser His Asp Phe Asn Ile Thr Leu Thr
325 330 

<210> 15 
<211> 1057 
<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (54)..(914)

<223> G777 

<400> 15 

gtggctctct ctttatcttt cttggagttt agttagagat tttaacgttg caa atg 56 

Met 
1 

gat caa cca atg aaa cca aaa act tgc tct gaa tct gat ttt gct gat 104
Asp Gln Pro Met Lys Pro Lys Thr Cys Ser Glu Ser Asp Phe Ala Asp
5 10 15

gat tcc tct gct tct tct tct tct tct tcg gga caa aat ctc aga gga 152
Asp Ser Ser Ala Ser Ser Ser Ser Ser Ser Gly Gln Asn Leu Arg Gly
20 25 30

gct gag atg gtg gtg gaa gtg aag aag gaa gca gtt tgt tcc cag aaa 200
Ala Glu Met Val Val Glu Val Lys Lys Glu Ala Val Cys Ser Gln Lys
35 40 45

gca gag cga gag aag ctt cgt aga gat aag ctt aag gaa cag ttt ctt 248
Ala Glu Arg Glu Lys Leu Arg Arg Asp Lys Leu Lys Glu Gln Phe Leu
50 55 60 65

gag ctt gga aat gca ctt gat ccg aat agg cct aag agt gac aaa gcc 296
Glu Leu Gly Asn Ala Leu Asp Pro Asn Arg Pro Lys Ser Asp Lys Ala
70 75 80

tca gtt ctc act gat aca ata caa atg ctc aag gat gta atg aac caa 344
Ser Val Leu Thr Asp Thr Ile Gln Met Leu Lys Asp Val Met Asn Gln
85 90 95

gtt gat aga cta aaa gct gag tat gaa aca cta tct caa gag tct cgt 392
Val Asp Arg Leu Lys Ala Glu Tyr Glu Thr Leu Ser Gln Glu Ser Arg
100 105 110

gag cta att caa gag aag agt gag ctg aga gag gag aaa gcg act tta 440
Glu Leu Ile Gln Glu Lys Ser Glu Leu Arg Glu Glu Lys Ala Thr Leu
115 120 125

aag tct gat atc gag att ctt aat gct caa tat cag cat aga atc aaa 488
Lys Ser Asp Ile Glu Ile Leu Asn Ala Gln Tyr Gln His Arg Ile Lys
130 135 140 145

acc atg gtt cca tgg gta cct cat tac agt tat cat atc ccc ttc gta 536
Thr Met Val Pro Trp Val Pro His Tyr Ser Tyr His Ile Pro Phe Val
150 155 160

gcc ata act cag ggt cag tcc agt ttt ata cct tat tca gcc tct gtc 584
Ala Ile Thr Gln Gly Gln Ser Ser Phe Ile Pro Tyr Ser Ala Ser Val
165 170 175

aat cct cta acc gaa caa caa gca tcg gtt cag cag cat tct tct tct 632
Asn Pro Leu Thr Glu Gln Gln Ala Ser Val Gln Gln His Ser Ser Ser
180 185 190

tct gcc gat gct tca atg aaa caa gat tcc aaa atc aag ccg tta gat 680
Ser Ala Asp Ala Ser Met Lys Gln Asp Ser Lys Ile Lys Pro Leu Asp
195 200 205

ttg gat ctg atg atg aac agt aac cat tca ggt caa gga aat gat caa 728
Leu Asp Leu Met Met Asn Ser Asn His Ser Gly Gln Gly Asn Asp Gln
210 215 220 225

aaa gat gat gtt cgt tta aag ctc gag ctt aaa atc cat gcc tct tct 776
Lys Asp Asp Val Arg Leu Lys Leu Glu Leu Lys Ile His Ala Ser Ser
230 235 240

tta gct caa cag gat gtt tct gga aaa gag aag aaa gta agc ttg aca 824
Leu Ala Gln Gln Asp Val Ser Gly Lys Glu Lys Lys Val Ser Leu Thr
245 250 255

acc act gca agc tca tcg aat agt tac tca tta tct caa gct gtt caa 872
Thr Thr Ala Ser Ser Ser Asn Ser Tyr Ser Leu Ser Gln Ala Val Gln
260 265 270

gat agt tcc ccc ggt acc gta aat gac atg ttg aag cca taa 914
Asp Ser Ser Pro Gly Thr Val Asn Asp Met Leu Lys Pro 
275 280 285 

accaataaac atattcccct gaacttgtgt ttaataccgt gattgagaag gtaccatgat 974 

taaacttgtt gtagattatc cacatgatta acgatgtatt cttatcacaa gcaaataaaa 1034 

cacaaaagca tttgcttaaa aaa 1057 

<210> 16 
<211> 286 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 16 

Met Asp Gln Pro Met Lys Pro Lys Thr Cys Ser Glu Ser Asp Phe Ala
1 5 10 15

Asp Asp Ser Ser Ala Ser Ser Ser Ser Ser Ser Gly Gln Asn Leu Arg
20 25 30

Gly Ala Glu Met Val Val Glu Val Lys Lys Glu Ala Val Cys Ser Gln
35 40 45

Lys Ala Glu Arg Glu Lys Leu Arg Arg Asp Lys Leu Lys Glu Gln Phe
50 55 60

Leu Glu Leu Gly Asn Ala Leu Asp Pro Asn Arg Pro Lys Ser Asp Lys
65 70 75 80

Ala Ser Val Leu Thr Asp Thr Ile Gln Met Leu Lys Asp Val Met Asn
85 90 95

Gln Val Asp Arg Leu Lys Ala Glu Tyr Glu Thr Leu Ser Gln Glu Ser
100 105 110

Arg Glu Leu Ile Gln Glu Lys Ser Glu Leu Arg Glu Glu Lys Ala Thr
115 120 125

Leu Lys Ser Asp Ile Glu Ile Leu Asn Ala Gln Tyr Gln His Arg Ile
130 135 140

Lys Thr Met Val Pro Trp Val Pro His Tyr Ser Tyr His Ile Pro Phe
145 150 155 160

Val Ala Ile Thr Gln Gly Gln Ser Ser Phe Ile Pro Tyr Ser Ala Ser
165 170 175

Val Asn Pro Leu Thr Glu Gln Gln Ala Ser Val Gln Gln His Ser Ser
180 185 190

Ser Ser Ala Asp Ala Ser Met Lys Gln Asp Ser Lys Ile Lys Pro Leu
195 200 205

Asp Leu Asp Leu Met Met Asn Ser Asn His Ser Gly Gln Gly Asn Asp
210 215 220

Gln Lys Asp Asp Val Arg Leu Lys Leu Glu Leu Lys Ile His Ala Ser
225 230 235 240

Ser Leu Ala Gln Gln Asp Val Ser Gly Lys Glu Lys Lys Val Ser Leu
245 250 255

Thr Thr Thr Ala Ser Ser Ser Asn Ser Tyr Ser Leu Ser Gln Ala Val
260 265 270

Gln Asp Ser Ser Pro Gly Thr Val Asn Asp Met Leu Lys Pro
275 280 285 



<210> 17
<211> 1571
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (428)..(1402)
<223> G869
<400> 17



aggaacagtg aaaggttcgg ttttttgggt ttcgatctga taatcaacaa gaaaaaaggg 60 

tttgatttat gtcggctggg tttgaatcga ctgtgatttt gtctttgatt catatctctt 120 

ctccgatttc atcatcatct tccccatcat cgtcgtcttt gaaatcttgt cttctcaacg 180 

ctcttcactt ctgctgtaat aagcagaggc ttgttctgga gactccttct ctttccatgc 240 

gcttaagacc caaaaggact tgttctagtg ttgaagtctt tgggggtttt cacataaagc 300 

agcaaaagtt ttcttttttc atagttcgct gagagttttg agttttgata ccaaaaaagt 360 

tttgaccttt tagagtgatt ttttgttctt tctgttttct gggtattttt gaggagtggg 420 

tttaaca atg gtt gcg att aga aag gaa cag tct ttg agt ggt gtt agt 469
Met Val Ala Ile Arg Lys Glu Gln Ser Leu Ser Gly Val Ser
1 5 10

agc gag att aag aag aga gct aag aga aac act cta tcg tcc ctt cct 517
Ser Glu Ile Lys Lys Arg Ala Lys Arg Asn Thr Leu Ser Ser Leu Pro
15 20 25 30

caa gaa acc caa cct ttg agg aaa gtc cgt att att gtg aat gat cct 565
Gln Glu Thr Gln Pro Leu Arg Lys Val Arg Ile Ile Val Asn Asp Pro
35 40 45

tat gct act gat gat tcc tct agt gat gag gaa gag ctt aag gtt cct 613
Tyr Ala Thr Asp Asp Ser Ser Ser Asp Glu Glu Glu Leu Lys Val Pro
50 55 60

aag cca agg aaa atg aaa cgt atc gtt cgt gag att aac ttt cct tct 661
Lys Pro Arg Lys Met Lys Arg Ile Val Arg Glu Ile Asn Phe Pro Ser
65 70 75

atg gaa gtt tct gaa cag cct tct gag agt tct tct cag gac agt act 709
Met Glu Val Ser Glu Gln Pro Ser Glu Ser Ser Ser Gln Asp Ser Thr
80 85 90

aaa act gat ggc aag ata gct gtg tca gct tct cct gct gtt cct agg 757
Lys Thr Asp Gly Lys Ile Ala Val Ser Ala Ser Pro Ala Val Pro Arg
95 100 105 110

aag aag cct gtt ggt gtt agg caa agg aaa tgg ggg aaa tgg gct gct 805
Lys Lys Pro Val Gly Val Arg Gln Arg Lys Trp Gly Lys Trp Ala Ala
115 120 125

gag att aga gat cct att aag aaa act agg act tgg ttg ggt act ttt 853
Glu Ile Arg Asp Pro Ile Lys Lys Thr Arg Thr Trp Leu Gly Thr Phe
130 135 140

gat act ctt gaa gaa gct gct aaa gct tat gat gct aag aag ctt gag 901
Asp Thr Leu Glu Glu Ala Ala Lys Ala Tyr Asp Ala Lys Lys Leu Glu
145 150 155

ttt gat gct att gtt gct gga aat gtg tcc act act aaa cgt gat gtt 949
Phe Asp Ala Ile Val Ala Gly Asn Val Ser Thr Thr Lys Arg Asp Val
160 165 170

tct tca tct gag act agc caa tgc tct cgt tct tca cct gtt gtt cct 997
Ser Ser Ser Glu Thr Ser Gln Cys Ser Arg Ser Ser Pro Val Val Pro
175 180 185 190

gtt gag caa gat gac act tct gca tca gct ctc act tgt gtc aac aac 1045
Val Glu Gln Asp Asp Thr Ser Ala Ser Ala Leu Thr Cys Val Asn Asn
195 200 205

cct gat gac gtc tcg acc gtt gct cca act gct cca act cca aat gtt 1093
Pro Asp Asp Val Ser Thr Val Ala Pro Thr Ala Pro Thr Pro Asn Val
210 215 220

cct gct ggt gga aac aag gaa acg ttg ttc gat ttc gac ttt act aat 1141
Pro Ala Gly Gly Asn Lys Glu Thr Leu Phe Asp Phe Asp Phe Thr Asn
225 230 235

cta cag atc cct gat ttt ggt ttc ttg gca gag gag caa caa gac cta 1189
Leu Gln Ile Pro Asp Phe Gly Phe Leu Ala Glu Glu Gln Gln Asp Leu
240 245 250

gac ttc gat tgt ttc ctc gcg gat gat cag ttt gat gat ttc ggc ttg 1237
Asp Phe Asp Cys Phe Leu Ala Asp Asp Gln Phe Asp Asp Phe Gly Leu
255 260 265 270

ctt gat gac att caa gga ttc gaa gat aac ggt cca agt gcg tta cca 1285
Leu Asp Asp Ile Gln Gly Phe Glu Asp Asn Gly Pro Ser Ala Leu Pro
275 280 285

gat ttc gac ttt gcg gat gtt gaa gat ctt cag cta gct gac tct agt 1333
Asp Phe Asp Phe Ala Asp Val Glu Asp Leu Gln Leu Ala Asp Ser Ser
290 295 300

ttc ggt ttc ctt gat caa ctt gct cct atc aac atc tct tgc cca tta 1381
Phe Gly Phe Leu Asp Gln Leu Ala Pro Ile Asn Ile Ser Cys Pro Leu
305 310 315

aaa agt ttt gca gct tca tag gatgttgctt agtaatgtta agtgagaaga 1432
Lys Ser Phe Ala Ala Ser 
320 

gtgttttgtt ttttcgttta tgctttagta atttaagaca tacaaaagtg tgtgttccgg 1492

attgtagtaa gatcttaaga cataaagccg ggttttgcaa ttaggaatcg agttttaatg 1552

aagttttagt ttatgtttg 1571 

<210> 18 
<211> 324
<212> PRT
<213> Arabidopsis thaliana
<400> 18 

Met Val Ala Ile Arg Lys Glu Gln Ser Leu Ser Gly Val Ser Ser Glu
1 5 10 15

Ile Lys Lys Arg Ala Lys Arg Asn Thr Leu Ser Ser Leu Pro Gln Glu
20 25 30

Thr Gln Pro Leu Arg Lys Val Arg Ile Ile Val Asn Asp Pro Tyr Ala
35 40 45

Thr Asp Asp Ser Ser Ser Asp Glu Glu Glu Leu Lys Val Pro Lys Pro
50 55 60

Arg Lys Met Lys Arg Ile Val Arg Glu Ile Asn Phe Pro Ser Met Glu
65 70 75 80

Val Ser Glu Gln Pro Ser Glu Ser Ser Ser Gln Asp Ser Thr Lys Thr
85 90 95

Asp Gly Lys Ile Ala Val Ser Ala Ser Pro Ala Val Pro Arg Lys Lys
100 105 110

Pro Val Gly Val Arg Gln Arg Lys Trp Gly Lys Trp Ala Ala Glu Ile
115 120 125

Arg Asp Pro Ile Lys Lys Thr Arg Thr Trp Leu Gly Thr Phe Asp Thr
130 135 140

Leu Glu Glu Ala Ala Lys Ala Tyr Asp Ala Lys Lys Leu Glu Phe Asp
145 150 155 160

Ala Ile Val Ala Gly Asn Val Ser Thr Thr Lys Arg Asp Val Ser Ser
165 170 175

Ser Glu Thr Ser Gln Cys Ser Arg Ser Ser Pro Val Val Pro Val Glu
180 185 190

Gln Asp Asp Thr Ser Ala Ser Ala Leu Thr Cys Val Asn Asn Pro Asp
195 200 205

Asp Val Ser Thr Val Ala Pro Thr Ala Pro Thr Pro Asn Val Pro Ala
210 215 220

Gly Gly Asn Lys Glu Thr Leu Phe Asp Phe Asp Phe Thr Asn Leu Gln
225 230 235 240

Ile Pro Asp Phe Gly Phe Leu Ala Glu Glu Gln Gln Asp Leu Asp Phe
245 250 255

Asp Cys Phe Leu Ala Asp Asp Gln Phe Asp Asp Phe Gly Leu Leu Asp
260 265 270

Asp Ile Gln Gly Phe Glu Asp Asn Gly Pro Ser Ala Leu Pro Asp Phe
275 280 285

Asp Phe Ala Asp Val Glu Asp Leu Gln Leu Ala Asp Ser Ser Phe Gly
290 295 300

Phe Leu Asp Gln Leu Ala Pro Ile Asn Ile Ser Cys Pro Leu Lys Ser
305 310 315 320



Phe Ala Ala Ser 



<210> 19 

<211> 1322 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (104)..(1084)

<223> G1133 

<400> 19 

ttcaagaaag aatcaccaag tgttgcgttc cacacatttg agcaacagct tccacaatcg 60 

tattgtattc ctgtaaagtt cccttggctt aaactgcaag agc atg cct ctt gat 115
Met Pro Leu Asp
1

acc aaa cag cag aaa tgg ttg cca tta ggc tta aat cct caa gct tgt 163
Thr Lys Gln Gln Lys Trp Leu Pro Leu Gly Leu Asn Pro Gln Ala Cys
5 10 15 20

gtc cag gac aag gcg act gag tat ttc cgt cct gga att cct ttt ccg 211
Val Gln Asp Lys Ala Thr Glu Tyr Phe Arg Pro Gly Ile Pro Phe Pro
25 30 35

gaa ctc ggt aaa gtt tat gca gct gag cat cag ttt cgc tat ttg cag 259
Glu Leu Gly Lys Val Tyr Ala Ala Glu His Gln Phe Arg Tyr Leu Gln
40 45 50

cca ccg ttc caa gcc tta ttg tct aga tat gat cag cag tct tgt gga 307
Pro Pro Phe Gln Ala Leu Leu Ser Arg Tyr Asp Gln Gln Ser Cys Gly
55 60 65

aaa caa gtt tca tgt ttg aat ggg cga tct agc aac ggt gct gct cca 355
Lys Gln Val Ser Cys Leu Asn Gly Arg Ser Ser Asn Gly Ala Ala Pro
70 75 80 



gag ggg gca ctc aag tct tct cgg aaa aga ttt ata gta ttc gat cag 403
Glu Gly Ala Leu Lys Ser Ser Arg Lys Arg Phe Ile Val Phe Asp Gln
85 90 95 100

tcg gga gag cag act cgt ttg tta caa tgt gga ttt cct ctg cgg ttt 451
Ser Gly Glu Gln Thr Arg Leu Leu Gln Cys Gly Phe Pro Leu Arg Phe
105 110 115

cct tct tct atg gat gca gag cga ggg aac att ctc ggt gcc cta cac 499
Pro Ser Ser Met Asp Ala Glu Arg Gly Asn Ile Leu Gly Ala Leu His
120 125 130

cca gag aaa ggg ttt agt aaa gat cat gcc att caa gaa aag ata ttg 547
Pro Glu Lys Gly Phe Ser Lys Asp His Ala Ile Gln Glu Lys Ile Leu
135 140 145

caa cat gaa gat cat gaa aat ggc gaa gaa gac tcg gaa atg cac gaa 595
Gln His Glu Asp His Glu Asn Gly Glu Glu Asp Ser Glu Met His Glu
150 155 160

gac act gag gaa atc aac gcg tta ctg tat tct gat gat gac gat aat 643
Asp Thr Glu Glu Ile Asn Ala Leu Leu Tyr Ser Asp Asp Asp Asp Asn
165 170 175 180

gat gat tgg gaa agt gat gat gaa gta atg agc act ggt cac tct cca 691
Asp Asp Trp Glu Ser Asp Asp Glu Val Met Ser Thr Gly His Ser Pro 
185 190 195 

ttc aca gtt gaa caa caa gcg tgc aac ata aca aca gaa gag ctg gat 739
Phe Thr Val Glu Gln Gln Ala Cys Asn Ile Thr Thr Glu Glu Leu Asp
200 205 210

gaa act gaa agc act gtt gat ggt cca ctt ctt aaa aga cag aaa cta 787
Glu Thr Glu Ser Thr Val Asp Gly Pro Leu Leu Lys Arg Gln Lys Leu
215 220 225

ctg gac cat tcg tac aga gac tca tca cca tcc ctt gtg ggc acc act 835
Leu Asp His Ser Tyr Arg Asp Ser Ser Pro Ser Leu Val Gly Thr Thr
230 235 240

aaa gtc aaa ggc tta tca gat gaa aac ctt cct gaa tca aac att tca 883
Lys Val Lys Gly Leu Ser Asp Glu Asn Leu Pro Glu Ser Asn Ile Ser
245 250 255 260

agc aaa caa gaa acg ggt tct ggt ttg agc gac gag cag tca aga aaa 931
Ser Lys Gln Glu Thr Gly Ser Gly Leu Ser Asp Glu Gln Ser Arg Lys
265 270 275

gac aag att cac acc gct ctg aga atc ctg gag agt gta gtt cca ggg 979
Asp Lys Ile His Thr Ala Leu Arg Ile Leu Glu Ser Val Val Pro Gly
280 285 290

gca aag gga aaa gaa gct ctt tta cta cta gac gaa gcc att gat tac 1027
Ala Lys Gly Lys Glu Ala Leu Leu Leu Leu Asp Glu Ala Ile Asp Tyr
295 300 305

ctc aag ttg ctg aag caa agc tta aac tca tca aag ggt ttg aat aac 1075
Leu Lys Leu Leu Lys Gln Ser Leu Asn Ser Ser Lys Gly Leu Asn Asn
310 315 320

cat tgg tga aaaacctaca accccttttg tcctattgat aaggcatgtt 1124
His Trp
325

tggttggtta aagagaagac atgggacaaa agataatcaa tgaggtaaag gactgatgaa 1184

gaagattctc tcaaattcat taacgtgggt ttgaaacaat tagaacacgc ctggtgaccc 1244

tagtgggacc gtatccactg ttcatctagc tggatcaata gtggtttact tttggatttg 1304 

gcatgctctc tcaaaaaa 1322 

<210> 20 

<211> 326 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 20 

Met Pro Leu Asp Thr Lys Gln Gln Lys Trp Leu Pro Leu Gly Leu Asn
1 5 10 15

Pro Gln Ala Cys Val Gln Asp Lys Ala Thr Glu Tyr Phe Arg Pro Gly
20 25 30

Ile Pro Phe Pro Glu Leu Gly Lys Val Tyr Ala Ala Glu His Gln Phe
35 40 45

Arg Tyr Leu Gln Pro Pro Phe Gln Ala Leu Leu Ser Arg Tyr Asp Gln
50 55 60

Gln Ser Cys Gly Lys Gln Val Ser Cys Leu Asn Gly Arg Ser Ser Asn
65 70 75 80

Gly Ala Ala Pro Glu Gly Ala Leu Lys Ser Ser Arg Lys Arg Phe Ile
85 90 95

Val Phe Asp Gln Ser Gly Glu Gln Thr Arg Leu Leu Gln Cys Gly Phe
100 105 110

Pro Leu Arg Phe Pro Ser Ser Met Asp Ala Glu Arg Gly Asn Ile Leu
115 120 125

Gly Ala Leu His Pro Glu Lys Gly Phe Ser Lys Asp His Ala Ile Gln
130 135 140

Glu Lys Ile Leu Gln His Glu Asp His Glu Asn Gly Glu Glu Asp Ser
145 150 155 160

Glu Met His Glu Asp Thr Glu Glu Ile Asn Ala Leu Leu Tyr Ser Asp
165 170 175

Asp Asp Asp Asn Asp Asp Trp Glu Ser Asp Asp Glu Val Met Ser Thr
180 185 190

Gly His Ser Pro Phe Thr Val Glu Gln Gln Ala Cys Asn Ile Thr Thr
195 200 205

Glu Glu Leu Asp Glu Thr Glu Ser Thr Val Asp Gly Pro Leu Leu Lys
210 215 220

Arg Gln Lys Leu Leu Asp His Ser Tyr Arg Asp Ser Ser Pro Ser Leu
225 230 235 240

Val Gly Thr Thr Lys Val Lys Gly Leu Ser Asp Glu Asn Leu Pro Glu
245 250 255

Ser Asn Ile Ser Ser Lys Gln Glu Thr Gly Ser Gly Leu Ser Asp Glu
260 265 270

Gln Ser Arg Lys Asp Lys Ile His Thr Ala Leu Arg Ile Leu Glu Ser
275 280 285

Val Val Pro Gly Ala Lys Gly Lys Glu Ala Leu Leu Leu Leu Asp Glu
290 295 300

Ala Ile Asp Tyr Leu Lys Leu Leu Lys Gln Ser Leu Asn Ser Ser Lys
305 310 315 320



Gly Leu Asn Asn His Trp 
325 



<210> 21 

<211> 859 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (62).. (718) 

<223> GX266

<400> 21

caatccacta acgatcccta accgaaaaca gagtagtcaa gaaacagagt attttttcta 60

c atg gat cca ttt tta att cag tcc cca ttc tcc ggc ttc tca ccg gaa 109
Met Asp Pro Phe Leu Ile Gln Ser Pro Phe Ser Gly Phe Ser Pro Glu
1 5 10 15

tat tct atc gga tct tct cca gat tct ttc tca tcc tct tct tct aac 157
Tyr Ser Ile Gly Ser Ser Pro Asp Ser Phe Ser Ser Ser Ser Ser Asn
20 25 30

aat tac tct ctt ccc ttc aac gag aac gac tca gag gaa atg ttt ctc 205
Asn Tyr Ser Leu Pro Phe Asn Glu Asn Asp Ser Glu Glu Met Phe Leu
35 40 45

tac ggt cta atc gag cag tcc acg caa caa acc tat att gac tcg gat 253
Tyr Gly Leu Ile Glu Gln Ser Thr Gln Gln Thr Tyr Ile Asp Ser Asp
50 55 60

agt caa gac ctt ccg atc aaa tcc gta agc tca aga aag tca gag aag 301
Ser Gln Asp Leu Pro Ile Lys Ser Val Ser Ser Arg Lys Ser Glu Lys
65 70 75 80

tct tac aga ggc gta aga cga cgg cca tgg ggg aaa ttc gcg gcg gag 349
Ser Tyr Arg Gly Val Arg Arg Arg Pro Trp Gly Lys Phe Ala Ala Glu
85 90 95

ata aga gat tcg act aga aac ggt att agg gtt tgg ctc ggg acg ttc 397
Ile Arg Asp Ser Thr Arg Asn Gly Ile Arg Val Trp Leu Gly Thr Phe
100 105 110

gaa agc gcg gaa gag gcg gct tta gcc tac gat caa gct gct ttc tcg 445
Glu Ser Ala Glu Glu Ala Ala Leu Ala Tyr Asp Gln Ala Ala Phe Ser
115 120 125

atg aga ggg tcc tcg gcg att ctc aat ttt tcg gcg gag aga gtt caa 493
Met Arg Gly Ser Ser Ala Ile Leu Asn Phe Ser Ala Glu Arg Val Gln
130 135 140

gag tcg ctt tcg gag att aaa tat acc tac gag gat ggt tgt tct ccg 541
Glu Ser Leu Ser Glu Ile Lys Tyr Thr Tyr Glu Asp Gly Cys Ser Pro
145 150 155 160

gtt gtg gcg ttg aag agg aaa cac tcg atg aga cgg aga atg acc aat 589
Val Val Ala Leu Lys Arg Lys His Ser Met Arg Arg Arg Met Thr Asn
165 170 175

aag aag acg aaa gat agt gac ttt gat cac cgc tcc gtg aag tta gat 637
Lys Lys Thr Lys Asp Ser Asp Phe Asp His Arg Ser Val Lys Leu Asp
180 185 190

aat gta gtt gtc ttt gag gat ttg gga gaa cag tac ctt gag gag ctt 685
Asn Val Val Val Phe Glu Asp Leu Gly Glu Gln Tyr Leu Glu Glu Leu
195 200 205

ttg ggg tct tct gaa aat agt ggg act tgg tga aagattagga tttgtattag 738
Leu Gly Ser Ser Glu Asn Ser Gly Thr Trp
210 215

ggaccttaag tttgaagtgg ttgattaatt ttaaccctaa tatgtttttt gtttgcttaa 798

atatttgatt ctattgagaa acatcgaaaa cagtttgtat gtacttttgt gatacttggc 858

g 859 

<210> 22 
<211> 218 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 22 

Met Asp Pro Phe Leu Ile Gln Ser Pro Phe Ser Gly Phe Ser Pro Glu
1 5 10 15

Tyr Ser Ile Gly Ser Ser Pro Asp Ser Phe Ser Ser Ser Ser Ser Asn
20 25 30

Asn Tyr Ser Leu Pro Phe Asn Glu Asn Asp Ser Glu Glu Met Phe Leu
35 40 45

Tyr Gly Leu Ile Glu Gln Ser Thr Gln Gln Thr Tyr Ile Asp Ser Asp
50 55 60

Ser Gln Asp Leu Pro Ile Lys Ser Val Ser Ser Arg Lys Ser Glu Lys
65 70 75 80

Ser Tyr Arg Gly Val Arg Arg Arg Pro Trp Gly Lys Phe Ala Ala Glu
85 90 95

Ile Arg Asp Ser Thr Arg Asn Gly Ile Arg Val Trp Leu Gly Thr Phe
100 105 110

Glu Ser Ala Glu Glu Ala Ala Leu Ala Tyr Asp Gln Ala Ala Phe Ser
115 120 125

Met Arg Gly Ser Ser Ala Ile Leu Asn Phe Ser Ala Glu Arg Val Gln
130 135 140

Glu Ser Leu Ser Glu Ile Lys Tyr Thr Tyr Glu Asp Gly Cys Ser Pro
145 150 155 160

Val Val Ala Leu Lys Arg Lys His Ser Met Arg Arg Arg Met Thr Asn
165 170 175

Lys Lys Thr Lys Asp Ser Asp Phe Asp His Arg Ser Val Lys Leu Asp
180 185 190

Asn Val Val Val Phe Glu Asp Leu Gly Glu Gln Tyr Leu Glu Glu Leu
195 200 205

Leu Gly Ser Ser Glu Asn Ser Gly Thr Trp 
210 215 

<210> 23 

<211> 1137 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (54).. (914) 

<223> G1324 

<400> 23 

cgaaaacacc acaaaccaaa tatcattaag taattaggaa acttaaacta agt atg 56 

Met 
1 

gaa aat tcg atg aag aag aag aag agc ttc aaa gaa agt gaa gat gaa 104
Glu Asn Ser Met Lys Lys Lys Lys Ser Phe Lys Glu Ser Glu Asp Glu
5 10 15

gaa cta aga aga ggg cct tgg act ttg gag gaa gac aca ctt ctc aca 152
Glu Leu Arg Arg Gly Pro Trp Thr Leu Glu Glu Asp Thr Leu Leu Thr
20 25 30

aat tac atc ctc cat aac ggt gag ggt cgt tgg aat cac gtc gcc aaa 200
Asn Tyr Ile Leu His Asn Gly Glu Gly Arg Trp Asn His Val Ala Lys
35 40 45

tgt gct ggg cta aag aga act ggg aaa agt tgt aga ttg aga tgg ttg 248
Cys Ala Gly Leu Lys Arg Thr Gly Lys Ser Cys Arg Leu Arg Trp Leu
50 55 60 65

aat tac ttg aaa ccc gac ata aga cga ggg aat ctt act cct caa gaa 296
Asn Tyr Leu Lys Pro Asp Ile Arg Arg Gly Asn Leu Thr Pro Gln Glu
70 75 80

cag ctt ttg atc ctt gag ctt cac tct aaa tgg ggt aat agg tgg tcc 344
Gln Leu Leu Ile Leu Glu Leu His Ser Lys Trp Gly Asn Arg Trp Ser
85 90 95

aag att gca cag tac ttg cca gga aga acg gat aac gag atc aag aac 392
Lys Ile Ala Gln Tyr Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn
100 105 110

tat tgg aga aca aga gtt caa aaa caa gct cgt caa ctc aac atc gaa 440
Tyr Trp Arg Thr Arg Val Gln Lys Gln Ala Arg Gln Leu Asn Ile Glu
115 120 125

tct aac agc gac aag ttc ttt gac gct gtt cgt agt ttt tgg gtc cct 488
Ser Asn Ser Asp Lys Phe Phe Asp Ala Val Arg Ser Phe Trp Val Pro
130 135 140 145

aga ttg atc gag aag atg gaa caa aac tca tcc act act act act tat 536
Arg Leu Ile Glu Lys Met Glu Gln Asn Ser Ser Thr Thr Thr Thr Tyr
150 155 160

tgt tgt ccc caa aac aac aac aac aac tct ctt ctt ctt cct tct caa 584
Cys Cys Pro Gln Asn Asn Asn Asn Asn Ser Leu Leu Leu Pro Ser Gln
165 170 175

tct cac gac tct tta agt atg caa aaa gat ata gat tac tcg ggt ttc 632
Ser His Asp Ser Leu Ser Met Gln Lys Asp Ile Asp Tyr Ser Gly Phe
180 185 190

agc aac ata gac ggt tct tct tca act tct act tgc atg tct cat cta 680
Ser Asn Ile Asp Gly Ser Ser Ser Thr Ser Thr Cys Met Ser His Leu
195 200 205

aca aca gtt cca cac ttt atg gat caa agc aac acc aat atc atc gat 728
Thr Thr Val Pro His Phe Met Asp Gln Ser Asn Thr Asn Ile Ile Asp
210 215 220 225

ggc tcg atg tgt ttc cat gaa ggc aat gtt caa gaa ttc gga gga tat 776
Gly Ser Met Cys Phe His Glu Gly Asn Val Gln Glu Phe Gly Gly Tyr
230 235 240

gtt cct ggc atg gag gat tac atg gta aac tcg gac atc tca atg gaa 824
Val Pro Gly Met Glu Asp Tyr Met Val Asn Ser Asp Ile Ser Met Glu
245 250 255

tgt cac gtg gcg gat ggt tat tca gcg tac gag gat gtt aca caa gat 872
Cys His Val Ala Asp Gly Tyr Ser Ala Tyr Glu Asp Val Thr Gln Asp
260 265 270

ccc atg tgg aat gtg gat gac att tgg cag ttt agg gag taa 914
Pro Met Trp Asn Val Asp Asp Ile Trp Gln Phe Arg Glu
275 280 285

ttaagtcgtc aagagatgag atggtagagc ctaccactac ggttctatta tatggactaa 974

tatacttctt ttgcttaact aagcaaaaag tttcgaacct tttacccata ttatctcggg 1034

ttggagacta gaacatgtta aatttgtatc ttctttgttg cgagtactta ctaagtcatt 1094

ggataaatat ttataatgat agtttcttgt acaaaaaaaa aaa 1137

<210> 24
<211> 286 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 24 

Met Glu Asn Ser Met Lys Lys Lys Lys Ser Phe Lys Glu Ser Glu Asp
1 5 10 15

Glu Glu Leu Arg Arg Gly Pro Trp Thr Leu Glu Glu Asp Thr Leu Leu
20 25 30

Thr Asn Tyr Ile Leu His Asn Gly Glu Gly Arg Trp Asn His Val Ala
35 40 45

Lys Cys Ala Gly Leu Lys Arg Thr Gly Lys Ser Cys Arg Leu Arg Trp
50 55 60

Leu Asn Tyr Leu Lys Pro Asp Ile Arg Arg Gly Asn Leu Thr Pro Gln
65 70 75 80

Glu Gln Leu Leu Ile Leu Glu Leu His Ser Lys Trp Gly Asn Arg Trp
85 90 95

Ser Lys Ile Ala Gln Tyr Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys
100 105 110

Asn Tyr Trp Arg Thr Arg Val Gln Lys Gln Ala Arg Gln Leu Asn Ile
115 120 125

Glu Ser Asn Ser Asp Lys Phe Phe Asp Ala Val Arg Ser Phe Trp Val
130 135 140

Pro Arg Leu Ile Glu Lys Met Glu Gln Asn Ser Ser Thr Thr Thr Thr
145 150 155 160

Tyr Cys Cys Pro Gln Asn Asn Asn Asn Asn Ser Leu Leu Leu Pro Ser
165 170 175

Gln Ser His Asp Ser Leu Ser Met Gln Lys Asp Ile Asp Tyr Ser Gly
180 185 190

Phe Ser Asn Ile Asp Gly Ser Ser Ser Thr Ser Thr Cys Met Ser His
195 200 205

Leu Thr Thr Val Pro His Phe Met Asp Gln Ser Asn Thr Asn Ile Ile
210 215 220

Asp Gly Ser Met Cys Phe His Glu Gly Asn Val Gln Glu Phe Gly Gly
225 230 235 240

Tyr Val Pro Gly Met Glu Asp Tyr Met Val Asn Ser Asp Ile Ser Met
245 250 255

Glu Cys His Val Ala Asp Gly Tyr Ser Ala Tyr Glu Asp Val Thr Gln
260 265 270

Asp Pro Met Trp Asn Val Asp Asp Ile Trp Gln Phe Arg Glu
275 280 285

<210> 25 

<211> 1630 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (97)..(1398)

<223> G1337 

<400> 25 

aatggatttg tcatcattct tctcaccgtc cttagtctct gaaaataaat tctgattttg 60 

atttcgaatt ttagggattt tgagagagag tcagtt atg agt agt tcg gag aga 114
Met Ser Ser Ser Glu Arg
1 5

gta ccg tgc gat ttc tgc ggc gag cgt acg gcg gtt ttg ttt tgt aga 162
Val Pro Cys Asp Phe Cys Gly Glu Arg Thr Ala Val Leu Phe Cys Arg
10 15 20

gcc gat acg gcg aag ctg tgt ttg cct tgt gat cag caa gtt cac acg 210
Ala Asp Thr Ala Lys Leu Cys Leu Pro Cys Asp Gln Gln Val His Thr
25 30 35

gcg aat ctg ttg tcg agg aag cac gtg cga tct cag atc tgc gat aat 258
Ala Asn Leu Leu Ser Arg Lys His Val Arg Ser Gln Ile Cys Asp Asn
40 45 50

tgc ggt aac gag cca gtc tct gtt cgg tgt ttc acc gat aat ctg att 306
Cys Gly Asn Glu Pro Val Ser Val Arg Cys Phe Thr Asp Asn Leu Ile
55 60 65 70

ttg tgt cag gag tgt gat tgg gat gtt cac gga agt tgt tca gtt tcc 354
Leu Cys Gln Glu Cys Asp Trp Asp Val His Gly Ser Cys Ser Val Ser
75 80 85

gat gct cat gtt cga tcc gcc gtg gaa ggt ttt tcc ggt tgt cca tcg 402
Asp Ala His Val Arg Ser Ala Val Glu Gly Phe Ser Gly Cys Pro Ser
90 95 100

gcg ttg gag ctt gct gct tta tgg gga ctt gat ttg gag caa ggg agg 450
Ala Leu Glu Leu Ala Ala Leu Trp Gly Leu Asp Leu Glu Gln Gly Arg
105 110 115

aaa gat gaa gag aat caa gtt ccg atg atg gcg atg atg atg gat aat 498
Lys Asp Glu Glu Asn Gln Val Pro Met Met Ala Met Met Met Asp Asn
120 125 130

ttc ggg atg cag ttg gat tct tgg gtt ttg gga tct aat gaa ttg att 546
Phe Gly Met Gln Leu Asp Ser Trp Val Leu Gly Ser Asn Glu Leu Ile
135 140 145 150

gtt ccc agc gat acg acg ttt aag aag cgt gga tct tgt gga tct agt 594
Val Pro Ser Asp Thr Thr Phe Lys Lys Arg Gly Ser Cys Gly Ser Ser
155 160 165

tgt ggg agg tat aag cag gta ttg tgt aag cag ctt gag gag ttg ctt 642
Cys Gly Arg Tyr Lys Gln Val Leu Cys Lys Gln Leu Glu Glu Leu Leu
170 175 180

aag agt ggt gtt gtc ggt ggt gat ggc gat gat ggt gat cgt gac cgt 690
Lys Ser Gly Val Val Gly Gly Asp Gly Asp Asp Gly Asp Arg Asp Arg
185 190 195

gat tgt gac cgt gag ggt gct tgt gat gga gat gga gat gga gaa gca 738
Asp Cys Asp Arg Glu Gly Ala Cys Asp Gly Asp Gly Asp Gly Glu Ala
200 205 210

gga gag ggg ctt atg gtt ccg gag atg tca gag aga ttg aaa tgg tca 786
Gly Glu Gly Leu Met Val Pro Glu Met Ser Glu Arg Leu Lys Trp Ser
215 220 225 230

aga gat gtt gag gag atc aat ggt ggc gga gga gga gga gtt aac cag 834
Arg Asp Val Glu Glu Ile Asn Gly Gly Gly Gly Gly Gly Val Asn Gln
235 240 245

cag tgg aat gct act act act aat cct agt ggt ggc cag agt tct cag 882
Gln Trp Asn Ala Thr Thr Thr Asn Pro Ser Gly Gly Gln Ser Ser Gln
250 255 260

ata tgg gat ttt aac ttg gga cag tca cgg gga cct gag gat acg agt 930
Ile Trp Asp Phe Asn Leu Gly Gln Ser Arg Gly Pro Glu Asp Thr Ser
265 270 275

cga gtg gaa gct gca tat gta ggg aaa ggt gct gct tct tca ttc aca 978
Arg Val Glu Ala Ala Tyr Val Gly Lys Gly Ala Ala Ser Ser Phe Thr
280 285 290

atc aac aat ttt gtt gac cat atg aat gaa act tgt tcc act aat gtg 1026
Ile Asn Asn Phe Val Asp His Met Asn Glu Thr Cys Ser Thr Asn Val
295 300 305 310

aaa ggt gtc aaa gag att aaa aag gat gac tac aag cga tca act tca 1074
Lys Gly Val Lys Glu Ile Lys Lys Asp Asp Tyr Lys Arg Ser Thr Ser
315 320 325

ggc cag gta caa cca aca aaa tct gag agc aac aat cgt cca att acc 1122
Gly Gln Val Gln Pro Thr Lys Ser Glu Ser Asn Asn Arg Pro Ile Thr
330 335 340

ttt ggc tct gag aaa ggt tcg aac tcc tcc agt gac ttg cat ttc aca 1170
Phe Gly Ser Glu Lys Gly Ser Asn Ser Ser Ser Asp Leu His Phe Thr
345 350 355

gag cat att gct gga act agt tgt aag acc aca aga cta gtt gca act 1218
Glu His Ile Ala Gly Thr Ser Cys Lys Thr Thr Arg Leu Val Ala Thr
360 365 370

aag gct gat ctg gag cgg ctg gct cag aac aga gga gat gca atg cag 1266
Lys Ala Asp Leu Glu Arg Leu Ala Gln Asn Arg Gly Asp Ala Met Gln
375 380 385 390

cgt tac aag gaa aag agg aag aca cgg aga tat gat aag acc ata agg 1314
Arg Tyr Lys Glu Lys Arg Lys Thr Arg Arg Tyr Asp Lys Thr Ile Arg
395 400 405

tat gaa tcg agg aag gca aga gct gac act agg ttg cgt gtc aga ggc 1362
Tyr Glu Ser Arg Lys Ala Arg Ala Asp Thr Arg Leu Arg Val Arg Gly
410 415 420

aga ttt gtg aaa gct agt gaa gct cct tac cct taa ccttaagttt 1408
Arg Phe Val Lys Ala Ser Glu Ala Pro Tyr Pro
425 430

tttcacatag gcttcctttt agctacaaac ttagttactt tttttactcc actgcctcat 1468

aaatgtacag accggtctcg tttcatctgg ccgcccttct tgttttattg ccttatctgg 1528

cccttttatg taccttggaa tcttatctag tttaaaaaag attgtaacct tctagaaaac 1588

catattctgt tgacagtata tacatgtcta tccaagcaaa aa 1630 

<210> 26 

<211> 433 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 26 

Met Ser Ser Ser Glu Arg Val Pro Cys Asp Phe Cys Gly Glu Arg Thr 
1 5 10 15

Page 37 



WO 01/36597 



PCT/US00/31344 



MBI-20 Sequence Listing.ST25 

Ala Val Leu Phe Cys Arg Ala Asp Thr Ala Lys Leu Cys Leu Pro Cys
20 25 30

Asp Gln Gln Val His Thr Ala Asn Leu Leu Ser Arg Lys His Val Arg
35 40 45

Ser Gln Ile Cys Asp Asn Cys Gly Asn Glu Pro Val Ser Val Arg Cys
50 55 60

Phe Thr Asp Asn Leu Ile Leu Cys Gln Glu Cys Asp Trp Asp Val His
65 70 75 80

Gly Ser Cys Ser Val Ser Asp Ala His Val Arg Ser Ala Val Glu Gly
85 90 95

Phe Ser Gly Cys Pro Ser Ala Leu Glu Leu Ala Ala Leu Trp Gly Leu
100 105 110

Asp Leu Glu Gln Gly Arg Lys Asp Glu Glu Asn Gln Val Pro Met Met
115 120 125

Ala Met Met Met Asp Asn Phe Gly Met Gln Leu Asp Ser Trp Val Leu
130 135 140

Gly Ser Asn Glu Leu Ile Val Pro Ser Asp Thr Thr Phe Lys Lys Arg
145 150 155 160

Gly Ser Cys Gly Ser Ser Cys Gly Arg Tyr Lys Gln Val Leu Cys Lys
165 170 175

Gln Leu Glu Glu Leu Leu Lys Ser Gly Val Val Gly Gly Asp Gly Asp
180 185 190

Asp Gly Asp Arg Asp Arg Asp Cys Asp Arg Glu Gly Ala Cys Asp Gly
195 200 205

Asp Gly Asp Gly Glu Ala Gly Glu Gly Leu Met Val Pro Glu Met Ser
210 215 220

Glu Arg Leu Lys Trp Ser Arg Asp Val Glu Glu Ile Asn Gly Gly Gly
225 230 235 240

Gly Gly Gly Val Asn Gln Gln Trp Asn Ala Thr Thr Thr Asn Pro Ser
245 250 255

Gly Gly Gln Ser Ser Gln Ile Trp Asp Phe Asn Leu Gly Gln Ser Arg
260 265 270

Gly Pro Glu Asp Thr Ser Arg Val Glu Ala Ala Tyr Val Gly Lys Gly
275 280 285

Ala Ala Ser Ser Phe Thr Ile Asn Asn Phe Val Asp His Met Asn Glu
290 295 300

Thr Cys Ser Thr Asn Val Lys Gly Val Lys Glu Ile Lys Lys Asp Asp
305 310 315 320

Tyr Lys Arg Ser Thr Ser Gly Gln Val Gln Pro Thr Lys Ser Glu Ser
325 330 335

Asn Asn Arg Pro Ile Thr Phe Gly Ser Glu Lys Gly Ser Asn Ser Ser
340 345 350

Ser Asp Leu His Phe Thr Glu His Ile Ala Gly Thr Ser Cys Lys Thr
355 360 365

Thr Arg Leu Val Ala Thr Lys Ala Asp Leu Glu Arg Leu Ala Gln Asn
370 375 380

Arg Gly Asp Ala Met Gln Arg Tyr Lys Glu Lys Arg Lys Thr Arg Arg
385 390 395 400

Tyr Asp Lys Thr Ile Arg Tyr Glu Ser Arg Lys Ala Arg Ala Asp Thr
405 410 415

Arg Leu Arg Val Arg Gly Arg Phe Val Lys Ala Ser Glu Ala Pro Tyr
420 425 430

Pro



<210> 27 

<211> 768 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (58).. (657) 

<223> G975 

<400> 27 

attactcatc atcaagttcc tactttctct ctgacaaaca tcacagagta agtaaga 57 

atg gta cag acg aag aag ttc aga ggt gtc agg caa cgc cat tgg ggt 105 
Met Val Gln Thr Lys Lys Phe Arg Gly Val Arg Gln Arg His Trp Gly
1 5 10 15

tct tgg gtc gct gag att cgt cat cct ctc ttg aaa cgg agg att tgg 153
Ser Trp Val Ala Glu Ile Arg His Pro Leu Leu Lys Arg Arg Ile Trp
20 25 30

cta ggg acg ttc gag acc gca gag gag gca gca aga gca tac gac gag 201
Leu Gly Thr Phe Glu Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Glu
35 40 45

gcc gcc gtt tta atg agc ggc cgc aac gcc aaa acc aac ttt ccc ctc 249
Ala Ala Val Leu Met Ser Gly Arg Asn Ala Lys Thr Asn Phe Pro Leu
50 55 60

aac aac aac aac acc gga gaa act tcc gag ggc aaa acc gat att tca 297
Asn Asn Asn Asn Thr Gly Glu Thr Ser Glu Gly Lys Thr Asp Ile Ser
65 70 75 80

gct tcg tcc aca atg tca tcc tca aca tca tct tca tcg ctc tct tcc 345
Ala Ser Ser Thr Met Ser Ser Ser Thr Ser Ser Ser Ser Leu Ser Ser
85 90 95

atc ctc agc gcc aaa ctg agg aaa tgc tgc aag tct cct tcc cca tcc 393
Ile Leu Ser Ala Lys Leu Arg Lys Cys Cys Lys Ser Pro Ser Pro Ser
100 105 110

ctc acc tgc ctc cgt ctt gac aca gcc agc tcc cat atc ggc gtc tgg 441
Leu Thr Cys Leu Arg Leu Asp Thr Ala Ser Ser His Ile Gly Val Trp
115 120 125

cag aaa cgg gcc ggt tca aag tct gac tcc agc tgg gtc atg acg gtg 489
Gln Lys Arg Ala Gly Ser Lys Ser Asp Ser Ser Trp Val Met Thr Val
130 135 140

gag cta ggt ccc gca agc tcc tcc caa gag act act agt aaa gct tca 537
Glu Leu Gly Pro Ala Ser Ser Ser Gln Glu Thr Thr Ser Lys Ala Ser
145 150 155 160

caa gac gct att ctt gct ccg acc act gaa gtt gaa att ggt ggc agc 585
Gln Asp Ala Ile Leu Ala Pro Thr Thr Glu Val Glu Ile Gly Gly Ser
165 170 175

aga gaa gaa gta ttg gat gag gaa gaa aag gtt gct ttg caa atg ata 633
Arg Glu Glu Val Leu Asp Glu Glu Glu Lys Val Ala Leu Gln Met Ile
180 185 190

gag gag ctt ctc aat aca aac taa atcttatttg cttatatata tgtacctatt 687
Glu Glu Leu Leu Asn Thr Asn
195

ttcattgctg atttacagcc aaaataatca attataccgt gtattttata gatgttttat 747

attaaaaggt tgttagatat a 768

<210> 28 
<211> 199 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 28 

Met Val Gln Thr Lys Lys Phe Arg Gly Val Arg Gln Arg His Trp Gly
1 5 10 15

Ser Trp Val Ala Glu Ile Arg His Pro Leu Leu Lys Arg Arg Ile Trp
20 25 30

Leu Gly Thr Phe Glu Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Glu
35 40 45

Ala Ala Val Leu Met Ser Gly Arg Asn Ala Lys Thr Asn Phe Pro Leu
50 55 60

Asn Asn Asn Asn Thr Gly Glu Thr Ser Glu Gly Lys Thr Asp Ile Ser
65 70 75 80

Ala Ser Ser Thr Met Ser Ser Ser Thr Ser Ser Ser Ser Leu Ser Ser
85 90 95

Ile Leu Ser Ala Lys Leu Arg Lys Cys Cys Lys Ser Pro Ser Pro Ser
100 105 110

Leu Thr Cys Leu Arg Leu Asp Thr Ala Ser Ser His Ile Gly Val Trp
115 120 125

Gln Lys Arg Ala Gly Ser Lys Ser Asp Ser Ser Trp Val Met Thr Val
130 135 140

Glu Leu Gly Pro Ala Ser Ser Ser Gln Glu Thr Thr Ser Lys Ala Ser
145 150 155 160

Gln Asp Ala Ile Leu Ala Pro Thr Thr Glu Val Glu Ile Gly Gly Ser
165 170 175

Arg Glu Glu Val Leu Asp Glu Glu Glu Lys Val Ala Leu Gln Met Ile
180 185 190

Glu Glu Leu Leu Asn Thr Asn
195



<210> 29

<211> 2526

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (338)..(2275)

<223> G680

<400> 29

[...] tttaaattta tttttagaga attttttttg 60

ttttgcttcc gatttgatta tttccgggaa cgatgacttc tccggggagt tcccggtgag 120

atgataagtc agattgcata cttgtctcct ccatggctac tctcaagggt tttggctgcg 180

gtggattcgt ttggtttctc tagaatctaa agaggttatc acaacggctt tgcaatttga 240

aaactttcat gtttggggag atcaaagatg gtttcttttt tatactttac ttgttagaga 300

ggatttgaag cagcgaatag ctgcaaccgg tcctgtt atg gat act aat aca tct 355
Met Asp Thr Asn Thr Ser
1 5

gga gaa gaa tta tta gct aag gca aga aag cca tat aca ata aca aag 403
Gly Glu Glu Leu Leu Ala Lys Ala Arg Lys Pro Tyr Thr Ile Thr Lys
10 15 20

cag cga gag cga tgg act gag gat gag cat gag agg ttt cta gaa gcc 451
Gln Arg Glu Arg Trp Thr Glu Asp Glu His Glu Arg Phe Leu Glu Ala
25 30 35

ttg agg ctt tat gga aga gct tgg caa cga att gaa gaa cat att ggg 499
Leu Arg Leu Tyr Gly Arg Ala Trp Gln Arg Ile Glu Glu His Ile Gly
40 45 50

aca aag act gct gtt cag atc aga agt cat gca caa aag ttc ttc aca 547
Thr Lys Thr Ala Val Gln Ile Arg Ser His Ala Gln Lys Phe Phe Thr
55 60 65 70

aag ttg gag aaa gag gct gaa gtt aaa ggc atc cct gtt tgc caa gct 595
Lys Leu Glu Lys Glu Ala Glu Val Lys Gly Ile Pro Val Cys Gln Ala
75 80 85

ttg gac ata gaa att ccg cct cct cgt cct aaa cga aaa ccc aat act 643
Leu Asp Ile Glu Ile Pro Pro Pro Arg Pro Lys Arg Lys Pro Asn Thr
90 95 100

cct tat cct cga aaa cct ggg aac aac ggt aca tct tcc tct caa gta 691
Pro Tyr Pro Arg Lys Pro Gly Asn Asn Gly Thr Ser Ser Ser Gln Val
105 110 115

tca tca gca aaa gat gca aaa ctt gtt tca tcg gcc tct tct tca cag 739
Ser Ser Ala Lys Asp Ala Lys Leu Val Ser Ser Ala Ser Ser Ser Gln
120 125 130

ttg aat cag gcg ttc ttg gat ttg gaa aaa atg ccg ttc tct gag aaa 787
Leu Asn Gln Ala Phe Leu Asp Leu Glu Lys Met Pro Phe Ser Glu Lys
135 140 145 150

aca tca act gga aaa gaa aat caa gat gag aat tgc tcg ggt gtt tct 835
Thr Ser Thr Gly Lys Glu Asn Gln Asp Glu Asn Cys Ser Gly Val Ser
155 160 165

act gtg aac aag tat ccc tta cca acg aaa cag gta agt ggc gac att 883
Thr Val Asn Lys Tyr Pro Leu Pro Thr Lys Gln Val Ser Gly Asp Ile
170 175 180

gaa aca agt aag acc tca act gtg gac aac gcg gtt caa gat gtt ccc 931
Glu Thr Ser Lys Thr Ser Thr Val Asp Asn Ala Val Gln Asp Val Pro
185 190 195

aag aag aac aaa gac aaa gat ggt aac gat ggt act act gtg cac agc 979
Lys Lys Asn Lys Asp Lys Asp Gly Asn Asp Gly Thr Thr Val His Ser
200 205 210

atg caa aac tac cct tgg cat ttc cac gca gat att gtg aac ggg aat 1027
Met Gln Asn Tyr Pro Trp His Phe His Ala Asp Ile Val Asn Gly Asn
215 220 225 230

ata gca aaa tgc cct caa aat cat ccc tca ggt atg gta tct caa gac 1075
Ile Ala Lys Cys Pro Gln Asn His Pro Ser Gly Met Val Ser Gln Asp
235 240 245

ttc atg ttt cat cct atg aga gaa gaa act cac ggg cac gca aat ctt 1123
Phe Met Phe His Pro Met Arg Glu Glu Thr His Gly His Ala Asn Leu
250 255 260

caa gct aca aca gca tct gct act act aca gct tct cat caa gcg ttt 1171
Gln Ala Thr Thr Ala Ser Ala Thr Thr Thr Ala Ser His Gln Ala Phe
265 270 275

cca gct tgt cat tca cag gat gat tac cgt tcg ttt ctc cag ata tca 1219
Pro Ala Cys His Ser Gln Asp Asp Tyr Arg Ser Phe Leu Gln Ile Ser
280 285 290

tct act ttc tcc aat ctt att atg tca act ctc cta cag aat cct gca 1267
Ser Thr Phe Ser Asn Leu Ile Met Ser Thr Leu Leu Gln Asn Pro Ala
295 300 305 310

gct cat gct gca gct aca ttc gct gct tcg gtc tgg cct tat gcg agt 1315
Ala His Ala Ala Ala Thr Phe Ala Ala Ser Val Trp Pro Tyr Ala Ser
315 320 325

gtc ggg aat tct ggt gat tca tca acc cca atg agc tct tct cct cca 1363
Val Gly Asn Ser Gly Asp Ser Ser Thr Pro Met Ser Ser Ser Pro Pro
330 335 340

agt ata act gcc att gcc gct gct aca gta gct gct gca act gct tgg 1411
Ser Ile Thr Ala Ile Ala Ala Ala Thr Val Ala Ala Ala Thr Ala Trp
345 350 355

tgg gct tct cat gga ctt ctt cct gta tgc gct cca gct cca ata aca 1459
Trp Ala Ser His Gly Leu Leu Pro Val Cys Ala Pro Ala Pro Ile Thr
360 365 370

tgt gtt cca ttc tca act gtt gca gtt cca act cca gca atg act gaa 1507
Cys Val Pro Phe Ser Thr Val Ala Val Pro Thr Pro Ala Met Thr Glu
375 380 385 390

atg gat acc gtt gaa aat act caa ccg ttt gag aaa caa aac aca gct 1555
Met Asp Thr Val Glu Asn Thr Gln Pro Phe Glu Lys Gln Asn Thr Ala
395 400 405

ctg caa gat caa acc ttg gct tcg aaa tct cca gct tca tca tct gat 1603
Leu Gln Asp Gln Thr Leu Ala Ser Lys Ser Pro Ala Ser Ser Ser Asp
410 415 420

gat tca gat gag act gga gta acc aag cta aat gcc gac tca aaa acc 1651
Asp Ser Asp Glu Thr Gly Val Thr Lys Leu Asn Ala Asp Ser Lys Thr
425 430 435

aat gat gat aaa att gag gag gtt gtt gtt act gcc gct gtg cat gac 1699
Asn Asp Asp Lys Ile Glu Glu Val Val Val Thr Ala Ala Val His Asp
440 445 450

tca aac act gcc cag aag aaa aat ctt gtg gac cgc tca tcg tgt ggc 1747
Ser Asn Thr Ala Gln Lys Lys Asn Leu Val Asp Arg Ser Ser Cys Gly
455 460 465 470

tca aat aca cct tca ggg agt gac gca gaa act gat gca tta gat aaa 1795
Ser Asn Thr Pro Ser Gly Ser Asp Ala Glu Thr Asp Ala Leu Asp Lys
475 480 485

atg gag aaa gat aaa gag gat gtg aag gag aca gat gag aat cag cca 1843
Met Glu Lys Asp Lys Glu Asp Val Lys Glu Thr Asp Glu Asn Gln Pro
490 495 500

gat gtt att gag tta aat aac cgt aag att aaa atg aga gac aac aac 1891
Asp Val Ile Glu Leu Asn Asn Arg Lys Ile Lys Met Arg Asp Asn Asn
505 510 515

agc aac aac aat gca act act gat tcg tgg aag gaa gtc tcc gaa gag 1939
Ser Asn Asn Asn Ala Thr Thr Asp Ser Trp Lys Glu Val Ser Glu Glu
520 525 530

ggt cgt ata gcg ttt cag gct ctc ttt gca aga gaa aga ttg cct caa 1987
Gly Arg Ile Ala Phe Gln Ala Leu Phe Ala Arg Glu Arg Leu Pro Gln
535 540 545 550

agc ttt tcg cct cct caa gtg gca gag aat gtg aat aga aaa caa agt 2035
Ser Phe Ser Pro Pro Gln Val Ala Glu Asn Val Asn Arg Lys Gln Ser
555 560 565

gac acg tca atg cca ttg gct cct aat ttc aaa agc cag gat tct tgt 2083
Asp Thr Ser Met Pro Leu Ala Pro Asn Phe Lys Ser Gln Asp Ser Cys
570 575 580

gct gca gac caa gaa gga gta gta atg atc ggt gtt gga aca tgc aag 2131
Ala Ala Asp Gln Glu Gly Val Val Met Ile Gly Val Gly Thr Cys Lys
585 590 595

agt ctt aaa acg aga cag aca gga ttt aag cca tac aag aga tgt tca 2179
Ser Leu Lys Thr Arg Gln Thr Gly Phe Lys Pro Tyr Lys Arg Cys Ser
600 605 610

atg gaa gtg aaa gag agc caa gtt ggg aac ata aac aat caa agt gat 2227
Met Glu Val Lys Glu Ser Gln Val Gly Asn Ile Asn Asn Gln Ser Asp
615 620 625 630

gaa aaa gtc tgc aaa agg ctt cga ttg gaa gga gaa gct tct aca tga 2275
Glu Lys Val Cys Lys Arg Leu Arg Leu Glu Gly Glu Ala Ser Thr
635 640 645

cagacttgga ggtaaaaaaa aaacatccac atttttatca atatctttaa atctagtgtt 2335

agtagtttgc ttctccaatc tttatgaaag agacttttaa ttttccttcc gaacatttct 2395

ttggtcatgt caggttctgt accatattac cccatgtctt gtctcttgtc tctgtttgtg 2455

tatgctactt gtggtctata tgtcatctgc tactactgtt aattaaccat taagcaatgg 2515

atttgtcttt a 2526

<210> 30 

<211> 645 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 30 

Met Asp Thr Asn Thr Ser Gly Glu Glu Leu Leu Ala Lys Ala Arg Lys 
1 5 10 15 



Pro Tyr Thr Ile Thr Lys Gln Arg Glu Arg Trp Thr Glu Asp Glu His
20 25 30

Glu Arg Phe Leu Glu Ala Leu Arg Leu Tyr Gly Arg Ala Trp Gln Arg
35 40 45

Ile Glu Glu His Ile Gly Thr Lys Thr Ala Val Gln Ile Arg Ser His
50 55 60

Ala Gln Lys Phe Phe Thr Lys Leu Glu Lys Glu Ala Glu Val Lys Gly
65 70 75 80

Ile Pro Val Cys Gln Ala Leu Asp Ile Glu Ile Pro Pro Pro Arg Pro
85 90 95

Lys Arg Lys Pro Asn Thr Pro Tyr Pro Arg Lys Pro Gly Asn Asn Gly
100 105 110

Thr Ser Ser Ser Gln Val Ser Ser Ala Lys Asp Ala Lys Leu Val Ser
115 120 125

Ser Ala Ser Ser Ser Gln Leu Asn Gln Ala Phe Leu Asp Leu Glu Lys
130 135 140

Met Pro Phe Ser Glu Lys Thr Ser Thr Gly Lys Glu Asn Gln Asp Glu
145 150 155 160

Asn Cys Ser Gly Val Ser Thr Val Asn Lys Tyr Pro Leu Pro Thr Lys
165 170 175

Gln Val Ser Gly Asp Ile Glu Thr Ser Lys Thr Ser Thr Val Asp Asn
180 185 190

Ala Val Gln Asp Val Pro Lys Lys Asn Lys Asp Lys Asp Gly Asn Asp
195 200 205

Gly Thr Thr Val His Ser Met Gln Asn Tyr Pro Trp His Phe His Ala
210 215 220

Asp Ile Val Asn Gly Asn Ile Ala Lys Cys Pro Gln Asn His Pro Ser
225 230 235 240

Gly Met Val Ser Gln Asp Phe Met Phe His Pro Met Arg Glu Glu Thr
245 250 255

His Gly His Ala Asn Leu Gln Ala Thr Thr Ala Ser Ala Thr Thr Thr
260 265 270

Ala Ser His Gln Ala Phe Pro Ala Cys His Ser Gln Asp Asp Tyr Arg
275 280 285

Ser Phe Leu Gln Ile Ser Ser Thr Phe Ser Asn Leu Ile Met Ser Thr
290 295 300

Leu Leu Gln Asn Pro Ala Ala His Ala Ala Ala Thr Phe Ala Ala Ser
305 310 315 320

Val Trp Pro Tyr Ala Ser Val Gly Asn Ser Gly Asp Ser Ser Thr Pro
325 330 335

Met Ser Ser Ser Pro Pro Ser Ile Thr Ala Ile Ala Ala Ala Thr Val
340 345 350

Ala Ala Ala Thr Ala Trp Trp Ala Ser His Gly Leu Leu Pro Val Cys
355 360 365

Ala Pro Ala Pro Ile Thr Cys Val Pro Phe Ser Thr Val Ala Val Pro
370 375 380

Thr Pro Ala Met Thr Glu Met Asp Thr Val Glu Asn Thr Gln Pro Phe
385 390 395 400

Glu Lys Gln Asn Thr Ala Leu Gln Asp Gln Thr Leu Ala Ser Lys Ser
405 410 415

Pro Ala Ser Ser Ser Asp Asp Ser Asp Glu Thr Gly Val Thr Lys Leu
420 425 430

Asn Ala Asp Ser Lys Thr Asn Asp Asp Lys Ile Glu Glu Val Val Val
435 440 445

Thr Ala Ala Val His Asp Ser Asn Thr Ala Gln Lys Lys Asn Leu Val
450 455 460

Asp Arg Ser Ser Cys Gly Ser Asn Thr Pro Ser Gly Ser Asp Ala Glu
465 470 475 480

Thr Asp Ala Leu Asp Lys Met Glu Lys Asp Lys Glu Asp Val Lys Glu
485 490 495

Thr Asp Glu Asn Gln Pro Asp Val Ile Glu Leu Asn Asn Arg Lys Ile
500 505 510

Lys Met Arg Asp Asn Asn Ser Asn Asn Asn Ala Thr Thr Asp Ser Trp
515 520 525

Lys Glu Val Ser Glu Glu Gly Arg Ile Ala Phe Gln Ala Leu Phe Ala
530 535 540

Arg Glu Arg Leu Pro Gln Ser Phe Ser Pro Pro Gln Val Ala Glu Asn
545 550 555 560

Val Asn Arg Lys Gln Ser Asp Thr Ser Met Pro Leu Ala Pro Asn Phe
565 570 575

Lys Ser Gln Asp Ser Cys Ala Ala Asp Gln Glu Gly Val Val Met Ile
580 585 590

Gly Val Gly Thr Cys Lys Ser Leu Lys Thr Arg Gln Thr Gly Phe Lys
595 600 605

Pro Tyr Lys Arg Cys Ser Met Glu Val Lys Glu Ser Gln Val Gly Asn
610 615 620

Ile Asn Asn Gln Ser Asp Glu Lys Val Cys Lys Arg Leu Arg Leu Glu
625 630 635 640

Gly Glu Ala Ser Thr
645



<210> 31

<211> 1195

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (67)..(1041)

<223> G883

<400> 31

ctctctcgtc ttcgtcttct tcttcttcaa cgttcctctc caaaatcctc agaccaagaa 60

atcatc atg gcc gtc gat cta atg cgt ttc cct aag ata gat gat caa 108
Met Ala Val Asp Leu Met Arg Phe Pro Lys Ile Asp Asp Gln
1 5 10

acg gct att cag gaa gct gca tcg caa ggt tta caa agt atg gaa cat 156
Thr Ala Ile Gln Glu Ala Ala Ser Gln Gly Leu Gln Ser Met Glu His
15 20 25 30

ctg atc cgt gtc ctc tct aac cgt ccc gaa caa caa cac aac gtt gac 204
Leu Ile Arg Val Leu Ser Asn Arg Pro Glu Gln Gln His Asn Val Asp
35 40 45

tgc tcc gag atc act gac ttc acc gtt tct aaa ttc aaa acc gtc att 252
Cys Ser Glu Ile Thr Asp Phe Thr Val Ser Lys Phe Lys Thr Val Ile
50 55 60

tct ctc ctt aac cgt act ggt cac gct cgg ttc aga cgc gga ccg gtt 300
Ser Leu Leu Asn Arg Thr Gly His Ala Arg Phe Arg Arg Gly Pro Val
65 70 75

cac tcc act tcc tct gcc gca tct cag aaa cta cag agt cag atc gtt 348
His Ser Thr Ser Ser Ala Ala Ser Gln Lys Leu Gln Ser Gln Ile Val
80 85 90

aaa aat act caa cct gag gct ccg ata gtg aga aca act acg aat cac 396
Lys Asn Thr Gln Pro Glu Ala Pro Ile Val Arg Thr Thr Thr Asn His
95 100 105 110

cct caa atc gtt cct cca ccg tct agt gta aca ctc gat ttc tct aaa 444
Pro Gln Ile Val Pro Pro Pro Ser Ser Val Thr Leu Asp Phe Ser Lys
115 120 125

cca agc atc ttc ggc acc aaa gct aag agc gcc gag ctg gaa ttc tcc 492
Pro Ser Ile Phe Gly Thr Lys Ala Lys Ser Ala Glu Leu Glu Phe Ser
130 135 140

aaa gaa aac ttc agt gtt tct tta aac tcc tca ttc atg tcg tcg gcg 540
Lys Glu Asn Phe Ser Val Ser Leu Asn Ser Ser Phe Met Ser Ser Ala
145 150 155

ata acc gga gac ggc agc gtc tcc aat gga aaa atc ttc ctt gct tct 588
Ile Thr Gly Asp Gly Ser Val Ser Asn Gly Lys Ile Phe Leu Ala Ser
160 165 170

gct ccg tcg cag cct gtt aac tct tcc gga aaa cca ccg ttg gct ggt 636
Ala Pro Ser Gln Pro Val Asn Ser Ser Gly Lys Pro Pro Leu Ala Gly
175 180 185 190

cat cct tac aga aag aga tgt ctc gag cat gag cac tca gag agt ttc 684
His Pro Tyr Arg Lys Arg Cys Leu Glu His Glu His Ser Glu Ser Phe
195 200 205

tcc gga aaa gtc tcc ggc tcc gcc tac gga aag tgc cat tgc aag aaa 732
Ser Gly Lys Val Ser Gly Ser Ala Tyr Gly Lys Cys His Cys Lys Lys
210 215 220

agg aaa aat cgg atg aag aga acc gtg aga gta ccg gcg ata agt gca 780
Arg Lys Asn Arg Met Lys Arg Thr Val Arg Val Pro Ala Ile Ser Ala
225 230 235

aag atc gcc gat att cca ccg gac gaa tat tcg tgg agg aag tac gga 828
Lys Ile Ala Asp Ile Pro Pro Asp Glu Tyr Ser Trp Arg Lys Tyr Gly
240 245 250

caa aaa ccg atc aag ggc tca cca cac cca cgt ggt tac tac aag tgc 876
Gln Lys Pro Ile Lys Gly Ser Pro His Pro Arg Gly Tyr Tyr Lys Cys
255 260 265 270

agt aca ttc aga gga tgt cca gcg agg aaa cac gtg gaa cga gca tta 924
Ser Thr Phe Arg Gly Cys Pro Ala Arg Lys His Val Glu Arg Ala Leu
275 280 285

gat gat cca gcg atg ctt att gtg aca tac gaa gga gag cac cgt cat 972
Asp Asp Pro Ala Met Leu Ile Val Thr Tyr Glu Gly Glu His Arg His
290 295 300

aac caa tcc gcg atg cag gag aat att tct tct tca ggc att aat gat 1020
Asn Gln Ser Ala Met Gln Glu Asn Ile Ser Ser Ser Gly Ile Asn Asp
305 310 315

tta gtg ttt gcc tcg gct tga cttttttttg tactatttgt tttttgattt 1071
Leu Val Phe Ala Ser Ala
320

tttgagtact ttagatggat tgaaatttgt aaattttttt attaagaaat caatttaaat 1131 

agagaaaaat tagtggtggt gcaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1191 

aaaa 1195 

<210> 32 

<211> 324 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 32 

Met Ala Val Asp Leu Met Arg Phe Pro Lys Ile Asp Asp Gln Thr Ala
1 5 10 15

Ile Gln Glu Ala Ala Ser Gln Gly Leu Gln Ser Met Glu His Leu Ile
20 25 30

Arg Val Leu Ser Asn Arg Pro Glu Gln Gln His Asn Val Asp Cys Ser
35 40 45

Glu Ile Thr Asp Phe Thr Val Ser Lys Phe Lys Thr Val Ile Ser Leu
50 55 60

Leu Asn Arg Thr Gly His Ala Arg Phe Arg Arg Gly Pro Val His Ser
65 70 75 80

Thr Ser Ser Ala Ala Ser Gln Lys Leu Gln Ser Gln Ile Val Lys Asn
85 90 95

Thr Gln Pro Glu Ala Pro Ile Val Arg Thr Thr Thr Asn His Pro Gln
100 105 110

Ile Val Pro Pro Pro Ser Ser Val Thr Leu Asp Phe Ser Lys Pro Ser
115 120 125

Ile Phe Gly Thr Lys Ala Lys Ser Ala Glu Leu Glu Phe Ser Lys Glu
130 135 140

Asn Phe Ser Val Ser Leu Asn Ser Ser Phe Met Ser Ser Ala Ile Thr
145 150 155 160

Gly Asp Gly Ser Val Ser Asn Gly Lys Ile Phe Leu Ala Ser Ala Pro
165 170 175

Ser Gln Pro Val Asn Ser Ser Gly Lys Pro Pro Leu Ala Gly His Pro
180 185 190

Tyr Arg Lys Arg Cys Leu Glu His Glu His Ser Glu Ser Phe Ser Gly
195 200 205

Lys Val Ser Gly Ser Ala Tyr Gly Lys Cys His Cys Lys Lys Arg Lys
210 215 220

Asn Arg Met Lys Arg Thr Val Arg Val Pro Ala Ile Ser Ala Lys Ile
225 230 235 240

Ala Asp Ile Pro Pro Asp Glu Tyr Ser Trp Arg Lys Tyr Gly Gln Lys
245 250 255

Pro Ile Lys Gly Ser Pro His Pro Arg Gly Tyr Tyr Lys Cys Ser Thr
260 265 270

Phe Arg Gly Cys Pro Ala Arg Lys His Val Glu Arg Ala Leu Asp Asp
275 280 285

Pro Ala Met Leu Ile Val Thr Tyr Glu Gly Glu His Arg His Asn Gln
290 295 300

Ser Ala Met Gln Glu Asn Ile Ser Ser Ser Gly Ile Asn Asp Leu Val
305 310 315 320



Phe Ala Ser Ala 



<210> 33 

<211> 1902 

<212> DNA 

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (1)..(1902)

<223> G1855 

<400> 33 

atg gcg aaa gag aac agt ggt cat cat cac caa aca gaa gca aga aga 48
Met Ala Lys Glu Asn Ser Gly His His His Gln Thr Glu Ala Arg Arg
1 5 10 15

aag aaa cta act ttg att ctt ggt gta agt gga ctc tgc att ttg ttc 96
Lys Lys Leu Thr Leu Ile Leu Gly Val Ser Gly Leu Cys Ile Leu Phe
20 25 30

tat gtt tta ggt gca tgg caa gcc aat acc gtc cca tct tct atc tcg 144
Tyr Val Leu Gly Ala Trp Gln Ala Asn Thr Val Pro Ser Ser Ile Ser
35 40 45

aag ctc gga tgc gag acg caa tca aac cct tct tcg tcc tct tcc tct 192
Lys Leu Gly Cys Glu Thr Gln Ser Asn Pro Ser Ser Ser Ser Ser Ser
50 55 60

tcc tca tct tca gag tca gct gaa cta gat ttc aaa agc cat aat cag 240
Ser Ser Ser Ser Glu Ser Ala Glu Leu Asp Phe Lys Ser His Asn Gln
65 70 75 80

att gag tta aag gaa aca aac caa acc att aag tac ttt gaa cca tgt 288
Ile Glu Leu Lys Glu Thr Asn Gln Thr Ile Lys Tyr Phe Glu Pro Cys
85 90 95

gaa tta tct ctc agt gag tac act cct tgt gaa gac cga caa aga gga 336
Glu Leu Ser Leu Ser Glu Tyr Thr Pro Cys Glu Asp Arg Gln Arg Gly
100 105 110



aga aga ttc gat agg aac atg atg aaa tat aga gaa aga cat tgt cct 384
Arg Arg Phe Asp Arg Asn Met Met Lys Tyr Arg Glu Arg His Cys Pro
115 120 125

gta aaa gat gag ctt ctt tat tgt ttg att cct cct cca cca aac tac 432
Val Lys Asp Glu Leu Leu Tyr Cys Leu Ile Pro Pro Pro Pro Asn Tyr
130 135 140

aag att cca ttt aaa tgg cca caa agt aga gac tat gct tgg tat gac 480
Lys Ile Pro Phe Lys Trp Pro Gln Ser Arg Asp Tyr Ala Trp Tyr Asp
145 150 155 160

aat atc cct cac aag gaa ctt agt gtt gag aaa gca gtt caa aac tgg 528
Asn Ile Pro His Lys Glu Leu Ser Val Glu Lys Ala Val Gln Asn Trp
165 170 175

att caa gtt gaa ggt gac cgc ttt aga ttc cct ggt ggt ggt act atg 576
Ile Gln Val Glu Gly Asp Arg Phe Arg Phe Pro Gly Gly Gly Thr Met
180 185 190

ttt cct cgt gga gct gat gct tat atc gat gat att gct agg ctt att 624
Phe Pro Arg Gly Ala Asp Ala Tyr Ile Asp Asp Ile Ala Arg Leu Ile
195 200 205

cct ctt act gat ggt gga atc aga aca gct att gac act gga tgt ggt 672
Pro Leu Thr Asp Gly Gly Ile Arg Thr Ala Ile Asp Thr Gly Cys Gly
210 215 220

gtt gca agt ttt ggt gct tac ctc ttg aag aga gac att atg gct gtg 720
Val Ala Ser Phe Gly Ala Tyr Leu Leu Lys Arg Asp Ile Met Ala Val
225 230 235 240

tct ttt gct cca aga gac act cat gaa gct cag gta cag ttt gct tta 768
Ser Phe Ala Pro Arg Asp Thr His Glu Ala Gln Val Gln Phe Ala Leu
245 250 255

gaa cgc gga gtt cct gcg ata atc ggg att atg gga tca aga aga ctt 816
Glu Arg Gly Val Pro Ala Ile Ile Gly Ile Met Gly Ser Arg Arg Leu
260 265 270

cct tat cca gct aga gct ttt gat ctt gct cat tgt tct cgt tgt ttg 864
Pro Tyr Pro Ala Arg Ala Phe Asp Leu Ala His Cys Ser Arg Cys Leu
275 280 285

atc cct tgg ttt aaa aat gat ggt ttg tac ctt atg gag gtc gac cgg 912
Ile Pro Trp Phe Lys Asn Asp Gly Leu Tyr Leu Met Glu Val Asp Arg
290 295 300

gtt tta aga ccg ggc ggt tac tgg atc ctc tcg gga cca ccg att aac 960
Val Leu Arg Pro Gly Gly Tyr Trp Ile Leu Ser Gly Pro Pro Ile Asn
305 310 315 320



tgg aaa cag tac tgg aga ggg tgg gag aga aca gag gag gat ttg aag 1008
Trp Lys Gln Tyr Trp Arg Gly Trp Glu Arg Thr Glu Glu Asp Leu Lys
325 330 335

aaa gag caa gat tca ata gaa gat gta gca aag agt ctt tgc tgg aag 1056
Lys Glu Gln Asp Ser Ile Glu Asp Val Ala Lys Ser Leu Cys Trp Lys
340 345 350

aaa gta act gaa aaa ggt gac tta tca att tgg caa aag cct ctc aat 1104
Lys Val Thr Glu Lys Gly Asp Leu Ser Ile Trp Gln Lys Pro Leu Asn
355 360 365

cac att gag tgt aaa aag ctc aaa caa aac aat aag tca cct ccg ata 1152
His Ile Glu Cys Lys Lys Leu Lys Gln Asn Asn Lys Ser Pro Pro Ile
370 375 380

tgc agc tca gat aac gcg gat tcc gct tgg tac aaa gac ttg gaa act 1200
Cys Ser Ser Asp Asn Ala Asp Ser Ala Trp Tyr Lys Asp Leu Glu Thr
385 390 395 400

tgt ata aca cca tta cca gaa aca aac aat cca gat gat tca gca ggc 1248
Cys Ile Thr Pro Leu Pro Glu Thr Asn Asn Pro Asp Asp Ser Ala Gly
405 410 415

ggt gca ctc gag gat tgg cca gac cga gca ttc gcg gta cct cca aga 1296
Gly Ala Leu Glu Asp Trp Pro Asp Arg Ala Phe Ala Val Pro Pro Arg
420 425 430

atc atc aga gga act ata cca gaa atg aac gcg gag aaa ttt aga gaa 1344
Ile Ile Arg Gly Thr Ile Pro Glu Met Asn Ala Glu Lys Phe Arg Glu
435 440 445

gac aac gag gtt tgg aaa gag aga ata gca cat tac aag aag ata gtc 1392
Asp Asn Glu Val Trp Lys Glu Arg Ile Ala His Tyr Lys Lys Ile Val
450 455 460

cct gag ctt tca cat gga aga ttc agg aac att atg gac atg aac gct 1440
Pro Glu Leu Ser His Gly Arg Phe Arg Asn Ile Met Asp Met Asn Ala
465 470 475 480



ttt ctc ggc gga ttc gct gct tcc atg ctg aaa tat ccc tca tgg gtc 1488
Phe Leu Gly Gly Phe Ala Ala Ser Met Leu Lys Tyr Pro Ser Trp Val
485 490 495

atg aac gtt gtc ccg gtc gat gca gag aaa caa acg tta ggt gtg atc 1536
Met Asn Val Val Pro Val Asp Ala Glu Lys Gln Thr Leu Gly Val Ile
500 505 510

tac gaa cgt gga ttg ata ggg acg tat caa gat tgg tgt gaa gga ttc 1584
Tyr Glu Arg Gly Leu Ile Gly Thr Tyr Gln Asp Trp Cys Glu Gly Phe
515 520 525

tca acg tat cca aga act tat gat atg att cat gca gga gga ttg ttc 1632
Ser Thr Tyr Pro Arg Thr Tyr Asp Met Ile His Ala Gly Gly Leu Phe
530 535 540

agc tta tac gaa cat agg tgt gat ttg acg ttg ata ttg ttg gag atg 1680
Ser Leu Tyr Glu His Arg Cys Asp Leu Thr Leu Ile Leu Leu Glu Met
545 550 555 560

gat cga att ttg aga cca gaa gga aca gtt gtg ttg aga gat aat gtg 1728
Asp Arg Ile Leu Arg Pro Glu Gly Thr Val Val Leu Arg Asp Asn Val
565 570 575

gag acg ttg aat aag gta gag aag ata gtg aag gga atg aag tgg aag 1776
Glu Thr Leu Asn Lys Val Glu Lys Ile Val Lys Gly Met Lys Trp Lys
580 585 590

agt caa att gtt gat cat gag aaa ggt cct ttt aat cct gag aag att 1824
Ser Gln Ile Val Asp His Glu Lys Gly Pro Phe Asn Pro Glu Lys Ile
595 600 605

ctt gtt gct gtt aaa act tat tgg act ggt caa cct tct gac aag aac 1872
Leu Val Ala Val Lys Thr Tyr Trp Thr Gly Gln Pro Ser Asp Lys Asn
610 615 620

aac aac aac aac aac aac aac aac aac tag 1902
Asn Asn Asn Asn Asn Asn Asn Asn Asn
625 630
<210> 34
<211> 633
<212> PRT

<213> Arabidopsis thaliana
<400> 34

Met Ala Lys Glu Asn Ser Gly His His His Gln Thr Glu Ala Arg Arg
1 5 10 15

Lys Lys Leu Thr Leu Ile Leu Gly Val Ser Gly Leu Cys Ile Leu Phe
20 25 30

Tyr Val Leu Gly Ala Trp Gln Ala Asn Thr Val Pro Ser Ser Ile Ser
35 40 45

Lys Leu Gly Cys Glu Thr Gln Ser Asn Pro Ser Ser Ser Ser Ser Ser
50 55 60

Ser Ser Ser Ser Glu Ser Ala Glu Leu Asp Phe Lys Ser His Asn Gln
65 70 75 80

Ile Glu Leu Lys Glu Thr Asn Gln Thr Ile Lys Tyr Phe Glu Pro Cys
85 90 95

Glu Leu Ser Leu Ser Glu Tyr Thr Pro Cys Glu Asp Arg Gln Arg Gly
100 105 110

Arg Arg Phe Asp Arg Asn Met Met Lys Tyr Arg Glu Arg His Cys Pro
115 120 125

Val Lys Asp Glu Leu Leu Tyr Cys Leu Ile Pro Pro Pro Pro Asn Tyr
130 135 140

Lys Ile Pro Phe Lys Trp Pro Gln Ser Arg Asp Tyr Ala Trp Tyr Asp
145 150 155 160

Asn Ile Pro His Lys Glu Leu Ser Val Glu Lys Ala Val Gln Asn Trp
165 170 175

Ile Gln Val Glu Gly Asp Arg Phe Arg Phe Pro Gly Gly Gly Thr Met
180 185 190

Phe Pro Arg Gly Ala Asp Ala Tyr Ile Asp Asp Ile Ala Arg Leu Ile
195 200 205

Pro Leu Thr Asp Gly Gly Ile Arg Thr Ala Ile Asp Thr Gly Cys Gly
210 215 220

Val Ala Ser Phe Gly Ala Tyr Leu Leu Lys Arg Asp Ile Met Ala Val
225 230 235 240

Ser Phe Ala Pro Arg Asp Thr His Glu Ala Gln Val Gln Phe Ala Leu
245 250 255

Glu Arg Gly Val Pro Ala Ile Ile Gly Ile Met Gly Ser Arg Arg Leu
260 265 270
Pro Tyr Pro Ala Arg Ala Phe Asp Leu Ala His Cys Ser Arg Cys Leu
275 280 285

Ile Pro Trp Phe Lys Asn Asp Gly Leu Tyr Leu Met Glu Val Asp Arg
290 295 300

Val Leu Arg Pro Gly Gly Tyr Trp Ile Leu Ser Gly Pro Pro Ile Asn
305 310 315 320

Trp Lys Gln Tyr Trp Arg Gly Trp Glu Arg Thr Glu Glu Asp Leu Lys
325 330 335

Lys Glu Gln Asp Ser Ile Glu Asp Val Ala Lys Ser Leu Cys Trp Lys
340 345 350

Lys Val Thr Glu Lys Gly Asp Leu Ser Ile Trp Gln Lys Pro Leu Asn
355 360 365

His Ile Glu Cys Lys Lys Leu Lys Gln Asn Asn Lys Ser Pro Pro Ile
370 375 380

Cys Ser Ser Asp Asn Ala Asp Ser Ala Trp Tyr Lys Asp Leu Glu Thr
385 390 395 400

Cys Ile Thr Pro Leu Pro Glu Thr Asn Asn Pro Asp Asp Ser Ala Gly
405 410 415

Gly Ala Leu Glu Asp Trp Pro Asp Arg Ala Phe Ala Val Pro Pro Arg
420 425 430

Ile Ile Arg Gly Thr Ile Pro Glu Met Asn Ala Glu Lys Phe Arg Glu
435 440 445

Asp Asn Glu Val Trp Lys Glu Arg Ile Ala His Tyr Lys Lys Ile Val
450 455 460

Pro Glu Leu Ser His Gly Arg Phe Arg Asn Ile Met Asp Met Asn Ala
465 470 475 480

Phe Leu Gly Gly Phe Ala Ala Ser Met Leu Lys Tyr Pro Ser Trp Val
485 490 495

Met Asn Val Val Pro Val Asp Ala Glu Lys Gln Thr Leu Gly Val Ile
500 505 510

Tyr Glu Arg Gly Leu Ile Gly Thr Tyr Gln Asp Trp Cys Glu Gly Phe
515 520 525

Ser Thr Tyr Pro Arg Thr Tyr Asp Met Ile His Ala Gly Gly Leu Phe
530 535 540

Ser Leu Tyr Glu His Arg Cys Asp Leu Thr Leu Ile Leu Leu Glu Met
545 550 555 560

Asp Arg Ile Leu Arg Pro Glu Gly Thr Val Val Leu Arg Asp Asn Val
565 570 575

Glu Thr Leu Asn Lys Val Glu Lys Ile Val Lys Gly Met Lys Trp Lys
580 585 590

Ser Gln Ile Val Asp His Glu Lys Gly Pro Phe Asn Pro Glu Lys Ile
595 600 605

Leu Val Ala Val Lys Thr Tyr Trp Thr Gly Gln Pro Ser Asp Lys Asn
610 615 620

Asn Asn Asn Asn Asn Asn Asn Asn Asn
625 630


<210> 35

<211> 2324

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (209)..(2020)

<223> G1190

<400> 35



tcctgtccca aaaccaaaag gcttgagagt gtgtctttag agagagatct tctctctttt 60 

atcttacgac tctcacttct tatctcaaat ctacttcaac tctatttcca gtctccacat 120 

tttcccacaa atttcaactc ttgttctctt catccaaagt aaaaaacaaa tcgttgcaag 180 

tgaggtttgg ttttggtgtt atagaatt atg aag agc ggg aag caa tct tcg 232
Met Lys Ser Gly Lys Gln Ser Ser
1 5

caa cct gaa aag ggt act tcc agg atc ttg tca ctg act gtc ctg ttt 280
Gln Pro Glu Lys Gly Thr Ser Arg Ile Leu Ser Leu Thr Val Leu Phe
10 15 20

atc gca ttt tgc ggt ttc tcc ttc tac ctc ggt ggt ata ttt tgc tct 328
Ile Ala Phe Cys Gly Phe Ser Phe Tyr Leu Gly Gly Ile Phe Cys Ser
25 30 35 40

gag aga gac aag att gta gcc aag gat gtc aca agg acg act aca aag 376
Glu Arg Asp Lys Ile Val Ala Lys Asp Val Thr Arg Thr Thr Thr Lys
45 50 55

gct gta gct tcc cct aaa gaa cct aca gct act cct att caa atc aaa 424
Ala Val Ala Ser Pro Lys Glu Pro Thr Ala Thr Pro Ile Gln Ile Lys
60 65 70

tcc gtt tct ttc ccg gag tgc ggg tca gag ttc caa gat tac acc ccg 472
Ser Val Ser Phe Pro Glu Cys Gly Ser Glu Phe Gln Asp Tyr Thr Pro
75 80 85

tgc acc gat cca aag agg tgg aag aag tat ggt gtc cat cgc tta agt 520
Cys Thr Asp Pro Lys Arg Trp Lys Lys Tyr Gly Val His Arg Leu Ser
90 95 100

ttc ttg gag cgt cat tgt cct ccg gta tat gaa aag aat gag tgt ttg 568
Phe Leu Glu Arg His Cys Pro Pro Val Tyr Glu Lys Asn Glu Cys Leu
105 110 115 120

att cca cca cca gac ggg tat aaa ccg cct ata aga tgg ccc aag agc 616
Ile Pro Pro Pro Asp Gly Tyr Lys Pro Pro Ile Arg Trp Pro Lys Ser
125 130 135

cga gaa cag tgt tgg tac agg aac gtg cct tat gat tgg atc aat aag 664
Arg Glu Gln Cys Trp Tyr Arg Asn Val Pro Tyr Asp Trp Ile Asn Lys
140 145 150



caa aag tct aac cag cat tgg ctt aag aaa gaa gga gat aag ttc cat 712
Gln Lys Ser Asn Gln His Trp Leu Lys Lys Glu Gly Asp Lys Phe His
155 160 165

ttc cct ggt ggt ggt acc atg ttc cct cgt gga gtt agt cac tat gtt 760
Phe Pro Gly Gly Gly Thr Met Phe Pro Arg Gly Val Ser His Tyr Val
170 175 180

gat ttg atg caa gat ctg att cct gaa atg aaa gac gga aca gtc agg 808
Asp Leu Met Gln Asp Leu Ile Pro Glu Met Lys Asp Gly Thr Val Arg
185 190 195 200

acc gcc att gat act ggc tgt ggg gtt gcg agc tgg gga ggc gat ctt 856
Thr Ala Ile Asp Thr Gly Cys Gly Val Ala Ser Trp Gly Gly Asp Leu
205 210 215

ttg gac cgt ggg ata cta tca ctc tct ctt gct cca aga gat aac cat 904
Leu Asp Arg Gly Ile Leu Ser Leu Ser Leu Ala Pro Arg Asp Asn His
220 225 230

gaa gct cag gtt caa ttt gct ctt gaa cgt gga att cct gcg att ctc 952
Glu Ala Gln Val Gln Phe Ala Leu Glu Arg Gly Ile Pro Ala Ile Leu
235 240 245

ggg atc atc tct acg caa cgt ctc cct ttt cct tca aat gca ttt gat 1000
Gly Ile Ile Ser Thr Gln Arg Leu Pro Phe Pro Ser Asn Ala Phe Asp
250 255 260

atg gct cat tgt tca aga tgt ctt att ccc tgg aca gaa ttt ggt gga 1048
Met Ala His Cys Ser Arg Cys Leu Ile Pro Trp Thr Glu Phe Gly Gly
265 270 275 280

atc tat tta ctt gag att cac cgt ata gtt cga cct gga ggt ttt tgg 1096
Ile Tyr Leu Leu Glu Ile His Arg Ile Val Arg Pro Gly Gly Phe Trp
285 290 295

gtt ctt tct ggt cca cct gtg aac tat aat aga cga tgg cgt gga tgg 1144
Val Leu Ser Gly Pro Pro Val Asn Tyr Asn Arg Arg Trp Arg Gly Trp
300 305 310



aac aca acc atg gaa gat cag aaa tct gac tac aac aag ctt cag tca 1192
Asn Thr Thr Met Glu Asp Gln Lys Ser Asp Tyr Asn Lys Leu Gln Ser
315 320 325

ctt cta acc tcc atg tgt ttc aaa aag tac gct caa aaa gat gac ata 1240
Leu Leu Thr Ser Met Cys Phe Lys Lys Tyr Ala Gln Lys Asp Asp Ile
330 335 340

gcc gtg tgg cag aaa ctc tca gac aaa tct tgc tat gac aaa atc gct 1288
Ala Val Trp Gln Lys Leu Ser Asp Lys Ser Cys Tyr Asp Lys Ile Ala
345 350 355 360

aag aac atg gaa gct tac cct ccc aaa tgt gac gac agt ata gaa cct 1336
Lys Asn Met Glu Ala Tyr Pro Pro Lys Cys Asp Asp Ser Ile Glu Pro
365 370 375

gat tct gct tgg tac act cca ctc cgt cct tgc gtg gtt gcc ccg aca 1384
Asp Ser Ala Trp Tyr Thr Pro Leu Arg Pro Cys Val Val Ala Pro Thr
380 385 390

cct aaa gtc aag aag tct ggt ctc gga tca atc cca aaa tgg ccc gag 1432
Pro Lys Val Lys Lys Ser Gly Leu Gly Ser Ile Pro Lys Trp Pro Glu
395 400 405

agg tta cat gtc gcg ccc gag aga atc ggt gat gtt cac gga ggg agt 1480
Arg Leu His Val Ala Pro Glu Arg Ile Gly Asp Val His Gly Gly Ser
410 415 420

gcg aac agt ttg aaa cac gat gat ggt aaa tgg aag aac aga gtt aag 1528
Ala Asn Ser Leu Lys His Asp Asp Gly Lys Trp Lys Asn Arg Val Lys
425 430 435 440

cat tac aag aaa gtt tta cca gct ctt ggg aca gac aag ata aga aat 1576
His Tyr Lys Lys Val Leu Pro Ala Leu Gly Thr Asp Lys Ile Arg Asn
445 450 455

gtt atg gat atg aac act gtt tat gga ggt ttc tct gcg gcc ctc att 1624
Val Met Asp Met Asn Thr Val Tyr Gly Gly Phe Ser Ala Ala Leu Ile
460 465 470

gag gat ccc att tgg gtc atg aac gtt gta tca tcg tac agc gca aat 1672
Glu Asp Pro Ile Trp Val Met Asn Val Val Ser Ser Tyr Ser Ala Asn
475 480 485

tcg ctt cct gtt gtc ttt gat cgc ggt ctc atc ggg act tac cac gac 1720
Ser Leu Pro Val Val Phe Asp Arg Gly Leu Ile Gly Thr Tyr His Asp
490 495 500

tgg tgc gaa gct ttc tca acg tat cca aga aca tat gat ctt ctt cac 1768
Trp Cys Glu Ala Phe Ser Thr Tyr Pro Arg Thr Tyr Asp Leu Leu His
505 510 515 520

ctc gac agt ctt ttt acc ttg gag agt cac agg tgt gag atg aag tac 1816
Leu Asp Ser Leu Phe Thr Leu Glu Ser His Arg Cys Glu Met Lys Tyr
525 530 535

att ttg cta gag atg gac agg atc ttg cgg ccg agt gga tat gtt ata 1864
Ile Leu Leu Glu Met Asp Arg Ile Leu Arg Pro Ser Gly Tyr Val Ile
540 545 550

atc cga gaa tcg agt tat ttc atg gac gca atc aca acg tta gcg aaa 1912
Ile Arg Glu Ser Ser Tyr Phe Met Asp Ala Ile Thr Thr Leu Ala Lys
555 560 565

ggg ata agg tgg agt tgc cgg aga gag gag act gag tat gca gtc aaa 1960
Gly Ile Arg Trp Ser Cys Arg Arg Glu Glu Thr Glu Tyr Ala Val Lys
570 575 580

agt gag aag att ctg gtt tgc cag aaa aag cta tgg ttt tcg tca aac 2008
Ser Glu Lys Ile Leu Val Cys Gln Lys Lys Leu Trp Phe Ser Ser Asn
585 590 595 600

caa acc tct tga tgagaccacc tgtatcatag tgtttatcat ctcctgtgat 2060
Gln Thr Ser

gcacactaca gagagaagga tctagtcctt tgagtccaag atatagctct ataaacaatc 2120

tccttttttt gttctcttta atttcttggg tatttcacgg tatagattga tattatatat 2180

tttttaatta tatttttaat atatagatat attagtatgt ggtttaaaca ctattattat 2240

caaggtctta aagatttgct ttgcaagagt taaaaaatgt tggagtaagg acctcttgat 2300

taataaattg actgacgcag caaa 2324

<210> 36 

<211> 603 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 36 

Met Lys Ser Gly Lys Gln Ser Ser Gln Pro Glu Lys Gly Thr Ser Arg
1 5 10 15

Ile Leu Ser Leu Thr Val Leu Phe Ile Ala Phe Cys Gly Phe Ser Phe
20 25 30

Tyr Leu Gly Gly Ile Phe Cys Ser Glu Arg Asp Lys Ile Val Ala Lys
35 40 45

Asp Val Thr Arg Thr Thr Thr Lys Ala Val Ala Ser Pro Lys Glu Pro
50 55 60




WO 01/36597 



MBI-20 Sequence Listing, ST25 



PCT/US00/31344 



Thr Ala Thr Pro Ile Gln Ile Lys Ser Val Ser Phe Pro Glu Cys Gly
65 70 75 80

Ser Glu Phe Gln Asp Tyr Thr Pro Cys Thr Asp Pro Lys Arg Trp Lys
85 90 95

Lys Tyr Gly Val His Arg Leu Ser Phe Leu Glu Arg His Cys Pro Pro
100 105 110

Val Tyr Glu Lys Asn Glu Cys Leu Ile Pro Pro Pro Asp Gly Tyr Lys
115 120 125

Pro Pro Ile Arg Trp Pro Lys Ser Arg Glu Gln Cys Trp Tyr Arg Asn
130 135 140

Val Pro Tyr Asp Trp Ile Asn Lys Gln Lys Ser Asn Gln His Trp Leu
145 150 155 160

Lys Lys Glu Gly Asp Lys Phe His Phe Pro Gly Gly Gly Thr Met Phe
165 170 175

Pro Arg Gly Val Ser His Tyr Val Asp Leu Met Gln Asp Leu Ile Pro
180 185 190

Glu Met Lys Asp Gly Thr Val Arg Thr Ala Ile Asp Thr Gly Cys Gly
195 200 205

Val Ala Ser Trp Gly Gly Asp Leu Leu Asp Arg Gly Ile Leu Ser Leu
210 215 220

Ser Leu Ala Pro Arg Asp Asn His Glu Ala Gln Val Gln Phe Ala Leu
225 230 235 240

Glu Arg Gly Ile Pro Ala Ile Leu Gly Ile Ile Ser Thr Gln Arg Leu
245 250 255

Pro Phe Pro Ser Asn Ala Phe Asp Met Ala His Cys Ser Arg Cys Leu
260 265 270

Ile Pro Trp Thr Glu Phe Gly Gly Ile Tyr Leu Leu Glu Ile His Arg
275 280 285

Ile Val Arg Pro Gly Gly Phe Trp Val Leu Ser Gly Pro Pro Val Asn
290 295 300

Tyr Asn Arg Arg Trp Arg Gly Trp Asn Thr Thr Met Glu Asp Gln Lys
305 310 315 320

Ser Asp Tyr Asn Lys Leu Gln Ser Leu Leu Thr Ser Met Cys Phe Lys
325 330 335

Lys Tyr Ala Gln Lys Asp Asp Ile Ala Val Trp Gln Lys Leu Ser Asp
340 345 350

Lys Ser Cys Tyr Asp Lys Ile Ala Lys Asn Met Glu Ala Tyr Pro Pro
355 360 365



Lys Cys Asp Asp Ser Ile Glu Pro Asp Ser Ala Trp Tyr Thr Pro Leu
370 375 380

Arg Pro Cys Val Val Ala Pro Thr Pro Lys Val Lys Lys Ser Gly Leu
385 390 395 400

Gly Ser Ile Pro Lys Trp Pro Glu Arg Leu His Val Ala Pro Glu Arg
405 410 415

Ile Gly Asp Val His Gly Gly Ser Ala Asn Ser Leu Lys His Asp Asp
420 425 430

Gly Lys Trp Lys Asn Arg Val Lys His Tyr Lys Lys Val Leu Pro Ala
435 440 445

Leu Gly Thr Asp Lys Ile Arg Asn Val Met Asp Met Asn Thr Val Tyr
450 455 460

Gly Gly Phe Ser Ala Ala Leu Ile Glu Asp Pro Ile Trp Val Met Asn
465 470 475 480

Val Val Ser Ser Tyr Ser Ala Asn Ser Leu Pro Val Val Phe Asp Arg
485 490 495

Gly Leu Ile Gly Thr Tyr His Asp Trp Cys Glu Ala Phe Ser Thr Tyr
500 505 510

Pro Arg Thr Tyr Asp Leu Leu His Leu Asp Ser Leu Phe Thr Leu Glu
515 520 525

Ser His Arg Cys Glu Met Lys Tyr Ile Leu Leu Glu Met Asp Arg Ile
530 535 540

Leu Arg Pro Ser Gly Tyr Val Ile Ile Arg Glu Ser Ser Tyr Phe Met
545 550 555 560

Asp Ala Ile Thr Thr Leu Ala Lys Gly Ile Arg Trp Ser Cys Arg Arg
565 570 575

Glu Glu Thr Glu Tyr Ala Val Lys Ser Glu Lys Ile Leu Val Cys Gln
580 585 590



Lys Lys Leu Trp Phe Ser Ser Asn Gln Thr Ser
595 600

<210> 37

<211> 1951

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (196)..(1794)

<223> G308



<400> 37 
agtaatttag tttttttttt ttttttttac aatttatttt gttattagaa gtggtagtgg 60

agtgaaaaaa caaatcctaa gcagtcctaa ccgatccccg aagctaaaga ttcttcacct 120

tcccaaataa agcaaaacct agatccgaca ttgaaggaaa aaccttttag atccatctct 180

gaaaaaaacc caacc atg aag aga gat cat cat cat cat cat caa gat aag 231
Met Lys Arg Asp His His His His His Gln Asp Lys
1 5 10

aag act atg atg atg aat gaa gaa gac gac ggt aac ggc atg gat gag 279 
Lys Thr Met Met Met Asn Glu Glu Asp Asp Gly Asn Gly Met Asp Glu 
15 20 25 

ctt cta gct gtt ctt ggt tac aag gtt agg tca tcg gaa atg gct gat 327
Leu Leu Ala Val Leu Gly Tyr Lys Val Arg Ser Ser Glu Met Ala Asp
30 35 40

gtt gct cag aaa ctc gag cag ctt gaa gtt atg atg tct aat gtt caa 375
Val Ala Gln Lys Leu Glu Gln Leu Glu Val Met Met Ser Asn Val Gln
45 50 55 60

gaa gac gat ctt tct caa ctc gct act gag act gtt cac tat aat ccg 423
Glu Asp Asp Leu Ser Gln Leu Ala Thr Glu Thr Val His Tyr Asn Pro
65 70 75

gcg gag ctt tac acg tgg ctt gat tct atg ctc acc gac ctt aat cct 471
Ala Glu Leu Tyr Thr Trp Leu Asp Ser Met Leu Thr Asp Leu Asn Pro
80 85 90

ccg tcg tct aac gcc gag tac gat ctt aaa gct att ccc ggt gac gcg 519
Pro Ser Ser Asn Ala Glu Tyr Asp Leu Lys Ala Ile Pro Gly Asp Ala
95 100 105

att ctc aat cag ttc gct atc gat tcg gct tct tcg tct aac caa ggc 567
Ile Leu Asn Gln Phe Ala Ile Asp Ser Ala Ser Ser Ser Asn Gln Gly
110 115 120

ggc gga gga gat acg tat act aca aac aag cgg ttg aaa tgc tca aac 615
Gly Gly Gly Asp Thr Tyr Thr Thr Asn Lys Arg Leu Lys Cys Ser Asn
125 130 135 140

ggc gtc gtg gaa acc acc aca gcg acg gct gag tca act cgg cat gtt 663
Gly Val Val Glu Thr Thr Thr Ala Thr Ala Glu Ser Thr Arg His Val
145 150 155

gtc ctg gtt gac tcg cag gag aac ggt gtg cgt ctc gtt cac gcg ctt 711
Val Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala Leu
160 165 170

ttg gct tgc gct gaa gct gtt cag aag gag aat ctg act gtg gcg gaa 759
Leu Ala Cys Ala Glu Ala Val Gln Lys Glu Asn Leu Thr Val Ala Glu
175 180 185

gct ctg gtg aag caa atc gga ttc tta gct gtt tct caa atc gga gct 807
Ala Leu Val Lys Gln Ile Gly Phe Leu Ala Val Ser Gln Ile Gly Ala
190 195 200

atg aga caa gtc gct act tac ttc gcc gaa gct ctc gcg cgg cgg att 855
Met Arg Gln Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg Ile
205 210 215 220

tac cgt ctc tct ccg tcg cag agt cca atc gac cac tct ctc tcc gat 903
Tyr Arg Leu Ser Pro Ser Gln Ser Pro Ile Asp His Ser Leu Ser Asp
225 230 235

act ctt cag atg cac ttc tac gag act tgt cct tat ctc aag ttc gct 951
Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe Ala
240 245 250

cac ttc acg gcg aat caa gcg att ctc gaa gct ttt caa ggg aag aaa 999
His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Gln Gly Lys Lys
255 260 265

aga gtt cat gtc att gat ttc tct atg agt caa ggt ctt caa tgg ccg 1047
Arg Val His Val Ile Asp Phe Ser Met Ser Gln Gly Leu Gln Trp Pro
270 275 280 

gcg ctt atg cag gct ctt gcg ctt cga cct ggt ggt cct cct gtt ttc 1095
Ala Leu Met Gln Ala Leu Ala Leu Arg Pro Gly Gly Pro Pro Val Phe
285 290 295 300

cgg tta acc gga att ggt cca ccg gca ccg gat aat ttc gat tat ctt 1143
Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Phe Asp Tyr Leu
305 310 315

cat gaa gtt ggg tgt aag ctg gct cat tta gct gag gcg att cac gtt 1191
His Glu Val Gly Cys Lys Leu Ala His Leu Ala Glu Ala Ile His Val
320 325 330

gag ttt gag tac aga gga ttt gtg gct aac act tta gct gat ctt gat 1239
Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Thr Leu Ala Asp Leu Asp
335 340 345

gct tcg atg ctt gag ctt aga cca agt gag att gaa tct gtt gcg gtt 1287
Ala Ser Met Leu Glu Leu Arg Pro Ser Glu Ile Glu Ser Val Ala Val
350 355 360

aac tct gtt ttc gag ctt cac aag ctc ttg gga cga cct ggt gcg atc 1335
Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Pro Gly Ala Ile
365 370 375 380

gat aag gtt ctt ggt gtg gtg aat cag att aaa ccg gag att ttc act 1383
Asp Lys Val Leu Gly Val Val Asn Gln Ile Lys Pro Glu Ile Phe Thr
385 390 395

gtg gtt gag cag gaa tcg aac cat aat agt ccg att ttc tta gat cgg 1431
Val Val Glu Gln Glu Ser Asn His Asn Ser Pro Ile Phe Leu Asp Arg
400 405 410

ttt act gag tcg ttg cat tat tac tcg acg ttg ttt gac tcg ttg gaa 1479
Phe Thr Glu Ser Leu His Tyr Tyr Ser Thr Leu Phe Asp Ser Leu Glu
415 420 425

ggt gta ccg agt ggt caa gac aag gtc atg tcg gag gtt tac ttg ggt 1527
Gly Val Pro Ser Gly Gln Asp Lys Val Met Ser Glu Val Tyr Leu Gly
430 435 440

aaa cag atc tgc aac gtt gtg gct tgt gat gga cct gac cga gtt gag 1575
Lys Gln Ile Cys Asn Val Val Ala Cys Asp Gly Pro Asp Arg Val Glu
445 450 455 460

cgt cat gaa acg ttg agt cag tgg agg aac cgg ttc ggg tct gct ggg 1623
Arg His Glu Thr Leu Ser Gln Trp Arg Asn Arg Phe Gly Ser Ala Gly
465 470 475

ttt gcg gct gca cat att ggt tcg aat gcg ttt aag caa gcg agt atg 1671
Phe Ala Ala Ala His Ile Gly Ser Asn Ala Phe Lys Gln Ala Ser Met
480 485 490

ctt ttg gct ctg ttc aac ggc ggt gag ggt tat cgg gtg gag gag agt 1719
Leu Leu Ala Leu Phe Asn Gly Gly Glu Gly Tyr Arg Val Glu Glu Ser
495 500 505

gac ggc tgt ctc atg ttg ggt tgg cac aca cga ccg ctc ata gcc acc 1767
Asp Gly Cys Leu Met Leu Gly Trp His Thr Arg Pro Leu Ile Ala Thr
510 515 520

tcg gct tgg aaa ctc tcc acc aat tag atggtggctc aatgaattga 1814
Ser Ala Trp Lys Leu Ser Thr Asn
525 530

tctgttgaac cggttatgat gatagatttc cgaccgaagc caaactaaat cctactgttt 1874

ttccctttgt cacttgttaa gatcttatct ttcattatat taggtaattg aaaaatttta 1934 

atctcgccta aattact 1951 

<210> 38 
<211> 532
<212> PRT 

<213> Arabidopsis thaliana 
<400> 38 

Met Lys Arg Asp His His His His His Gln Asp Lys Lys Thr Met Met
1 5 10 15

Met Asn Glu Glu Asp Asp Gly Asn Gly Met Asp Glu Leu Leu Ala Val
20 25 30

Leu Gly Tyr Lys Val Arg Ser Ser Glu Met Ala Asp Val Ala Gln Lys
35 40 45

Leu Glu Gln Leu Glu Val Met Met Ser Asn Val Gln Glu Asp Asp Leu
50 55 60

Ser Gln Leu Ala Thr Glu Thr Val His Tyr Asn Pro Ala Glu Leu Tyr
65 70 75 80

Thr Trp Leu Asp Ser Met Leu Thr Asp Leu Asn Pro Pro Ser Ser Asn
85 90 95

Ala Glu Tyr Asp Leu Lys Ala Ile Pro Gly Asp Ala Ile Leu Asn Gln
100 105 110

Phe Ala Ile Asp Ser Ala Ser Ser Ser Asn Gln Gly Gly Gly Gly Asp
115 120 125

Thr Tyr Thr Thr Asn Lys Arg Leu Lys Cys Ser Asn Gly Val Val Glu
130 135 140

Thr Thr Thr Ala Thr Ala Glu Ser Thr Arg His Val Val Leu Val Asp
145 150 155 160

Ser Gln Glu Asn Gly Val Arg Leu Val His Ala Leu Leu Ala Cys Ala
165 170 175

Glu Ala Val Gln Lys Glu Asn Leu Thr Val Ala Glu Ala Leu Val Lys
180 185 190

Gln Ile Gly Phe Leu Ala Val Ser Gln Ile Gly Ala Met Arg Gln Val
195 200 205

Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg Ile Tyr Arg Leu Ser
210 215 220

Pro Ser Gln Ser Pro Ile Asp His Ser Leu Ser Asp Thr Leu Gln Met
225 230 235 240

His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe Ala His Phe Thr Ala
245 250 255

Asn Gln Ala Ile Leu Glu Ala Phe Gln Gly Lys Lys Arg Val His Val
260 265 270

Ile Asp Phe Ser Met Ser Gln Gly Leu Gln Trp Pro Ala Leu Met Gln
275 280 285



Ala Leu Ala Leu Arg Pro Gly Gly Pro Pro Val Phe Arg Leu Thr Gly 
290 295 300 



Ile Gly Pro Pro Ala Pro Asp Asn Phe Asp Tyr Leu His Glu Val Gly
305 310 315 320

Cys Lys Leu Ala His Leu Ala Glu Ala Ile His Val Glu Phe Glu Tyr
325 330 335

Arg Gly Phe Val Ala Asn Thr Leu Ala Asp Leu Asp Ala Ser Met Leu
340 345 350

Glu Leu Arg Pro Ser Glu Ile Glu Ser Val Ala Val Asn Ser Val Phe
355 360 365

Glu Leu His Lys Leu Leu Gly Arg Pro Gly Ala Ile Asp Lys Val Leu
370 375 380

Gly Val Val Asn Gln Ile Lys Pro Glu Ile Phe Thr Val Val Glu Gln
385 390 395 400

Glu Ser Asn His Asn Ser Pro Ile Phe Leu Asp Arg Phe Thr Glu Ser
405 410 415

Leu His Tyr Tyr Ser Thr Leu Phe Asp Ser Leu Glu Gly Val Pro Ser
420 425 430

Gly Gln Asp Lys Val Met Ser Glu Val Tyr Leu Gly Lys Gln Ile Cys
435 440 445

Asn Val Val Ala Cys Asp Gly Pro Asp Arg Val Glu Arg His Glu Thr
450 455 460

Leu Ser Gln Trp Arg Asn Arg Phe Gly Ser Ala Gly Phe Ala Ala Ala
465 470 475 480

His Ile Gly Ser Asn Ala Phe Lys Gln Ala Ser Met Leu Leu Ala Leu
485 490 495

Phe Asn Gly Gly Glu Gly Tyr Arg Val Glu Glu Ser Asp Gly Cys Leu
500 505 510

Met Leu Gly Trp His Thr Arg Pro Leu Ile Ala Thr Ser Ala Trp Lys
515 520 525 



Leu Ser Thr Asn 
530 



<210> 39 

<211> 1445 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (236)..(1306)

<223> G1944

<400> 39

tcgaccttcc taatttccaa cctctgttct tagcaatata ttttttctcc aaaaataatt 60 

ctcagtttga ttttcttctt ctagctctta agtatatttc tttgttgtta tttatctttt 120 

aatcctttaa tctcatcttt gtttatcttt aatcaaaacc caaaatttac atgggttctt 180 

gaaaatctag aagaaataaa ggaaacataa caaaaataga aagaaaaaga agcta atg 238 

Met 
1 

gtc tta aat atg gag tct acc gga gaa get gtt aga tea ace ace ggt 286 
Val Leu Asn Met Glu Ser Thr Gly Glu Ala Val Arg Ser Thr Thr Gly 
5 10 15 

aac gac ggt ggt att acg gtg gtt aga tcc gac gcg ccg tca gat ttc 334
Asn Asp Gly Gly Ile Thr Val Val Arg Ser Asp Ala Pro Ser Asp Phe
20 25 30 

cac gta gct caa aga tca gaa agc tca aac caa tct ccc acc tct gtc 382
His Val Ala Gln Arg Ser Glu Ser Ser Asn Gln Ser Pro Thr Ser Val
35 40 45 

act cct cct cca cca cag cca tcg tct cat cac aca gct cct ccg ccg 430
Thr Pro Pro Pro Pro Gln Pro Ser Ser His His Thr Ala Pro Pro Pro
50 55 60 65 

ctg caa att tcg acg gtg acg act acg act acg acg gcc gcg atg gaa 478
Leu Gln Ile Ser Thr Val Thr Thr Thr Thr Thr Thr Ala Ala Met Glu
70 75 80 

ggt atc tcc ggt gga ctg atg aag aag aag cgt gga cgg cca agg aag 526
Gly Ile Ser Gly Gly Leu Met Lys Lys Lys Arg Gly Arg Pro Arg Lys
85 90 95 

tat gga ccg gac ggg act gtt gta gcg tta tct cct aaa ccg att tca 574
Tyr Gly Pro Asp Gly Thr Val Val Ala Leu Ser Pro Lys Pro Ile Ser
100 105 110 

tca gcg ccg gcg ccg tcg cat ctt ccg ccg ccg agt tca cac gtc atc 622
Ser Ala Pro Ala Pro Ser His Leu Pro Pro Pro Ser Ser His Val Ile
115 120 125 

gat ttc tcc gct tct gag aaa cgt agc aaa gtg aaa cca acg aac tcg 670
Asp Phe Ser Ala Ser Glu Lys Arg Ser Lys Val Lys Pro Thr Asn Ser
130 135 140 145 

ttt aac aga aca aag tat cat cac caa gtt gag aat ttg ggt gaa tgg 718 
Phe Asn Arg Thr Lys Tyr His His Gln Val Glu Asn Leu Gly Glu Trp
150 155 160 

gct cct tgc tcc gtc ggt ggt aat ttc aca cct cat ata atc aca gtc 766
Ala Pro Cys Ser Val Gly Gly Asn Phe Thr Pro His Ile Ile Thr Val
165 170 175 

aac acc ggc gag gat gta aca atg aag ata atc tcg ttt tcg caa caa 814
Asn Thr Gly Glu Asp Val Thr Met Lys Ile Ile Ser Phe Ser Gln Gln
180 185 190 

gga cct cgc tct att tgt gtt ctg tca gca aac ggt gtt att tca agc 862
Gly Pro Arg Ser Ile Cys Val Leu Ser Ala Asn Gly Val Ile Ser Ser
195 200 205 

gtt aca ctt cgt cag cca gat tcc tct ggc ggc aca ttg aca tac gaa 910
Val Thr Leu Arg Gln Pro Asp Ser Ser Gly Gly Thr Leu Thr Tyr Glu
210 215 220 225 

ggt cgg ttt gag ata tta tca tta tcc ggg tca ttc atg cct aat gat 958
Gly Arg Phe Glu Ile Leu Ser Leu Ser Gly Ser Phe Met Pro Asn Asp
230 235 240 

tca ggc gga aca cga agt aga acg gga gga atg agt gta tcg tta gca 1006
Ser Gly Gly Thr Arg Ser Arg Thr Gly Gly Met Ser Val Ser Leu Ala
245 250 255 

agt ccc gat gga cgt gta gta ggc ggt ggc ctc gcc ggt tta cta gta 1054
Ser Pro Asp Gly Arg Val Val Gly Gly Gly Leu Ala Gly Leu Leu Val 
260 265 270 

gcc gcg agt ccg gtt cag gtg gtt gta gga agt ttt tta gcg ggc act 1102
Ala Ala Ser Pro Val Gln Val Val Val Gly Ser Phe Leu Ala Gly Thr
275 280 285 

gac cat caa gat cag aaa ccg aaa aag aac aaa cat gat ttc atg ttg 1150 
Asp His Gln Asp Gln Lys Pro Lys Lys Asn Lys His Asp Phe Met Leu
290 295 300 305 

tcg agt cct acc gct gca att cct atc tct agt gca gct gat cac cgg 1198
Ser Ser Pro Thr Ala Ala Ile Pro Ile Ser Ser Ala Ala Asp His Arg
310 315 320 

aca atc cat tcg gtc tcg tct ctt ccg gtc aat aat aat aca tgg cag 1246
Thr Ile His Ser Val Ser Ser Leu Pro Val Asn Asn Asn Thr Trp Gln
325 330 335 

act tct tta gct tcc gat cca aga aac aag cat acc gat att aat gtc 1294
Thr Ser Leu Ala Ser Asp Pro Arg Asn Lys His Thr Asp Ile Asn Val
340 345 350 

aat gta act tga aatccaatct ttctctgtat tttctgttaa caagtttgat 1346 
Asn Val Thr 
355 

ttggttgttt atctacatta ggattttact aaaatggtag tattatttat agggttttag 1406 
ggtctttatt ttggttccac tgttgtcact tgtaggata 1445 

<210> 40 
<211> 356 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 40 

Met Val Leu Asn Met Glu Ser Thr Gly Glu Ala Val Arg Ser Thr Thr 
1 5 10 15

Gly Asn Asp Gly Gly Ile Thr Val Val Arg Ser Asp Ala Pro Ser Asp
20 25 30 

Phe His Val Ala Gln Arg Ser Glu Ser Ser Asn Gln Ser Pro Thr Ser
35 40 45

Val Thr Pro Pro Pro Pro Gln Pro Ser Ser His His Thr Ala Pro Pro
50 55 60 

Pro Leu Gln Ile Ser Thr Val Thr Thr Thr Thr Thr Thr Ala Ala Met
65 70 75 80 

Glu Gly Ile Ser Gly Gly Leu Met Lys Lys Lys Arg Gly Arg Pro Arg
85 90 95 

Lys Tyr Gly Pro Asp Gly Thr Val Val Ala Leu Ser Pro Lys Pro Ile
100 105 110 



Ser Ser Ala Pro Ala Pro Ser His Leu Pro Pro Pro Ser Ser His Val 
115 120 125

Ile Asp Phe Ser Ala Ser Glu Lys Arg Ser Lys Val Lys Pro Thr Asn
130 135 140 

Ser Phe Asn Arg Thr Lys Tyr His His Gln Val Glu Asn Leu Gly Glu
145 150 155 160 

Trp Ala Pro Cys Ser Val Gly Gly Asn Phe Thr Pro His Ile Ile Thr
165 170 175 

Val Asn Thr Gly Glu Asp Val Thr Met Lys Ile Ile Ser Phe Ser Gln
180 185 190 

Gln Gly Pro Arg Ser Ile Cys Val Leu Ser Ala Asn Gly Val Ile Ser
195 200 205 

Ser Val Thr Leu Arg Gln Pro Asp Ser Ser Gly Gly Thr Leu Thr Tyr
210 215 220 

Glu Gly Arg Phe Glu Ile Leu Ser Leu Ser Gly Ser Phe Met Pro Asn
225 230 235 240 

Asp Ser Gly Gly Thr Arg Ser Arg Thr Gly Gly Met Ser Val Ser Leu 
245 250 255 

Ala Ser Pro Asp Gly Arg Val Val Gly Gly Gly Leu Ala Gly Leu Leu 
260 265 270 

Val Ala Ala Ser Pro Val Gln Val Val Val Gly Ser Phe Leu Ala Gly
275 280 285 

Thr Asp His Gln Asp Gln Lys Pro Lys Lys Asn Lys His Asp Phe Met
290 295 300 

Leu Ser Ser Pro Thr Ala Ala Ile Pro Ile Ser Ser Ala Ala Asp His
305 310 315 320 

Arg Thr Ile His Ser Val Ser Ser Leu Pro Val Asn Asn Asn Thr Trp
325 330 335 

Gln Thr Ser Leu Ala Ser Asp Pro Arg Asn Lys His Thr Asp Ile Asn
340 345 350 

Val Asn Val Thr 
355 

<210> 41 

<211> 1558 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (191)..(1396)

<223> G326 

<400> 41 

caattaatga catcttcttc ttctcctttc actgcaaaac cgaaagcttg agactttgag 60 

attatgtcta tgtcatcttc ttcttcttcc atcgatcact tcatcacctt tcgtcatctt 120

gatcttattc tccactgtat aaaatcagcg agattttaag ggattgtgaa ggtaccatct 180

taaacacaaa atg ggt act tct act aca gag agt gtg gtg gcg tgt gaa 229 
Met Gly Thr Ser Thr Thr Glu Ser Val Val Ala Cys Glu 
1 5 10

ttt tgc ggc gag aga acg gcg gtt ctg ttt tgt aga gcc gat acg gcg 277 
Phe Cys Gly Glu Arg Thr Ala Val Leu Phe Cys Arg Ala Asp Thr Ala 
15 20 25 

aag ctt tgt ttg cct tgt gac cag cac gtg cac tcg gcg aac ctt ctc 325
Lys Leu Cys Leu Pro Cys Asp Gln His Val His Ser Ala Asn Leu Leu
30 35 40 45

tcg agg aag cat gtt cgt tct cag atc tgt gat aac tgt agc aaa gag 373
Ser Arg Lys His Val Arg Ser Gln Ile Cys Asp Asn Cys Ser Lys Glu
50 55 60 

ccg gtg tcc gta cgt tgc ttc aca gat aat ctc gta ttg tgt cag gag 421
Pro Val Ser Val Arg Cys Phe Thr Asp Asn Leu Val Leu Cys Gln Glu
65 70 75 

tgt gat tgg gat gtt cac gga agc tgt tcc tcc tcc gcg acg cat gaa 469
Cys Asp Trp Asp Val His Gly Ser Cys Ser Ser Ser Ala Thr His Glu 
80 85 90 

cgc tcc gcc gtg gaa ggg ttt tca ggt tgt cct tcg gtt ttg gag ctt 517
Arg Ser Ala Val Glu Gly Phe Ser Gly Cys Pro Ser Val Leu Glu Leu 
95 100 105 

gct gct gtg tgg gga atc gat tta aag ggt aag aag aaa gaa gat gac 565
Ala Ala Val Trp Gly Ile Asp Leu Lys Gly Lys Lys Lys Glu Asp Asp
110 115 120 125 

gaa gac gaa ttg act aag aat ttt ggg atg ggg ttg gat tcg tgg ggt 613
Glu Asp Glu Leu Thr Lys Asn Phe Gly Met Gly Leu Asp Ser Trp Gly 
130 135 140 

tct gga tct aac atc gtt caa gaa ctg att gtt cct tat gat gtg tct 661
Ser Gly Ser Asn Ile Val Gln Glu Leu Ile Val Pro Tyr Asp Val Ser
145 150 155

tgc aaa aag caa agc ttt agc ttt ggg agg tct aag cag gta gtg ttt 709
Cys Lys Lys Gln Ser Phe Ser Phe Gly Arg Ser Lys Gln Val Val Phe
160 165 170 

gaa cag ctt gag tta ctg aag aga ggc ttc gtt gaa ggc gaa gga gag 757 
Glu Gln Leu Glu Leu Leu Lys Arg Gly Phe Val Glu Gly Glu Gly Glu
175 180 185 

att atg gtt ccg gag gga atc aat ggc gga gga agc att tct cag cca 805
Ile Met Val Pro Glu Gly Ile Asn Gly Gly Gly Ser Ile Ser Gln Pro
190 195 200 205 

tct ccg acg acg tcg ttt act tct ttg ctt atg tct caa agt ctt tgt 853
Ser Pro Thr Thr Ser Phe Thr Ser Leu Leu Met Ser Gln Ser Leu Cys
210 215 220 

ggt aat ggt atg caa tgg aat gct act aat cat agc act ggc cag aac 901
Gly Asn Gly Met Gln Trp Asn Ala Thr Asn His Ser Thr Gly Gln Asn
225 230 235 

act cag ata tgg gat ttt aac ttg gga cag tcg agg aac cct gat gaa 949
Thr Gln Ile Trp Asp Phe Asn Leu Gly Gln Ser Arg Asn Pro Asp Glu
240 245 250 

cct agt cca gtc gaa act aaa ggc tct act ttc aca ttc aac aac gtt 997 
Pro Ser Pro Val Glu Thr Lys Gly Ser Thr Phe Thr Phe Asn Asn Val 
255 260 265 

act cat ctc aag aac gat acc cga acc acc aat atg aat gct ttc aaa 1045
Thr His Leu Lys Asn Asp Thr Arg Thr Thr Asn Met Asn Ala Phe Lys 
270 275 280 285

gag agt tac cag gag gat tcc gtc cac tca act tct acc aag gga cag 1093
Glu Ser Tyr Gln Glu Asp Ser Val His Ser Thr Ser Thr Lys Gly Gln
290 295 300 

gaa aca tct aag agc aac aat att cct gct gcc att cac tcg cat aaa 1141
Glu Thr Ser Lys Ser Asn Asn Ile Pro Ala Ala Ile His Ser His Lys
305 310 315 

agt tct aac gac tcc tgt ggc ttg cat tgc acg gaa cat att gct att 1189
Ser Ser Asn Asp Ser Cys Gly Leu His Cys Thr Glu His Ile Ala Ile
320 325 330 

act agt aat aga gcc aca aga ttg gtg gcg gta acg aat gct gat cta 1237
Thr Ser Asn Arg Ala Thr Arg Leu Val Ala Val Thr Asn Ala Asp Leu 
335 340 345 

gag cag atg gca cag aac aga gat aat gct atg cag cgg tac aag gaa 1285
Glu Gln Met Ala Gln Asn Arg Asp Asn Ala Met Gln Arg Tyr Lys Glu
350 355 360 365 

aag aag aaa acg cgg aga tat gat aag acc ata aga tat gaa acg agg 1333
Lys Lys Lys Thr Arg Arg Tyr Asp Lys Thr Ile Arg Tyr Glu Thr Arg
370 375 380

aag gcg aga gcc gag acc agg ttg cgt gtt aag ggc aga ttt gtg aaa 1381
Lys Ala Arg Ala Glu Thr Arg Leu Arg Val Lys Gly Arg Phe Val Lys 
385 390 395 

gct aca gat cct tag atgtctctcc acgttaggtt ttacatttga gatcctaagt 1436
Ala Thr Asp Pro 
400 

taggaacttt ttttgttttt tctactttca actaccttgt aaatgtaaat gatcgatctt 1496 

cagctgcata atgtgtggcc agatttttgt aatttttacg tttaaccttc taaaaaaaaa 1556

aa 1558 

<210> 42 

<211> 401 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 42 

Met Gly Thr Ser Thr Thr Glu Ser Val Val Ala Cys Glu Phe Cys Gly 
1 5 10 15 

Glu Arg Thr Ala Val Leu Phe Cys Arg Ala Asp Thr Ala Lys Leu Cys 
20 25 30 

Leu Pro Cys Asp Gln His Val His Ser Ala Asn Leu Leu Ser Arg Lys
35 40 45 

His Val Arg Ser Gln Ile Cys Asp Asn Cys Ser Lys Glu Pro Val Ser
50 55 60 

Val Arg Cys Phe Thr Asp Asn Leu Val Leu Cys Gln Glu Cys Asp Trp
65 70 75 80 

Asp Val His Gly Ser Cys Ser Ser Ser Ala Thr His Glu Arg Ser Ala 
85 90 95 

Val Glu Gly Phe Ser Gly Cys Pro Ser Val Leu Glu Leu Ala Ala Val 
100 105 110 



Trp Gly Ile Asp Leu Lys Gly Lys Lys Lys Glu Asp Asp Glu Asp Glu
115 120 125

Leu Thr Lys Asn Phe Gly Met Gly Leu Asp Ser Trp Gly Ser Gly Ser 
130 135 140 

Asn Ile Val Gln Glu Leu Ile Val Pro Tyr Asp Val Ser Cys Lys Lys
145 150 155 160 

Gln Ser Phe Ser Phe Gly Arg Ser Lys Gln Val Val Phe Glu Gln Leu
165 170 175 

Glu Leu Leu Lys Arg Gly Phe Val Glu Gly Glu Gly Glu Ile Met Val
180 185 190 

Pro Glu Gly Ile Asn Gly Gly Gly Ser Ile Ser Gln Pro Ser Pro Thr
195 200 205 

Thr Ser Phe Thr Ser Leu Leu Met Ser Gln Ser Leu Cys Gly Asn Gly
210 215 220 

Met Gln Trp Asn Ala Thr Asn His Ser Thr Gly Gln Asn Thr Gln Ile
225 230 235 240 

Trp Asp Phe Asn Leu Gly Gln Ser Arg Asn Pro Asp Glu Pro Ser Pro
245 250 255 

Val Glu Thr Lys Gly Ser Thr Phe Thr Phe Asn Asn Val Thr His Leu 
260 265 270 



Lys Asn Asp Thr Arg Thr Thr Asn Met Asn Ala Phe Lys Glu Ser Tyr 
275 280 285 

Gln Glu Asp Ser Val His Ser Thr Ser Thr Lys Gly Gln Glu Thr Ser
290 295 300



Lys Ser Asn Asn Ile Pro Ala Ala Ile His Ser His Lys Ser Ser Asn
305 310 315 320 

Asp Ser Cys Gly Leu His Cys Thr Glu His Ile Ala Ile Thr Ser Asn
325 330 335 

Arg Ala Thr Arg Leu Val Ala Val Thr Asn Ala Asp Leu Glu Gln Met
340 345 350 

Ala Gln Asn Arg Asp Asn Ala Met Gln Arg Tyr Lys Glu Lys Lys Lys
355 360 365 

Thr Arg Arg Tyr Asp Lys Thr Ile Arg Tyr Glu Thr Arg Lys Ala Arg
370 375 380 

Ala Glu Thr Arg Leu Arg Val Lys Gly Arg Phe Val Lys Ala Thr Asp 
385 390 395 400 



Pro

<210> 43

<211> 844 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (89)..(658)

<223> G1387 

<400> 43 

tctctctccc actctcactt tctctcctat tcttagttcg tgtcagaaac acacagagaa 60 

attaagaacc ctaatttaaa acagaaga atg gta cat tcg aag aag ttc cga 112

Met Val His Ser Lys Lys Phe Arg 
1 5 

ggt gtc cgc cag cgt cag tgg ggt tct tgg gtt tct gag att cgt cat 160 
Gly Val Arg Gln Arg Gln Trp Gly Ser Trp Val Ser Glu Ile Arg His
10 15 20 

cct ctc ttg aag aga aga gtg tgg cta gga aca ttc gac acg gcg gaa 208
Pro Leu Leu Lys Arg Arg Val Trp Leu Gly Thr Phe Asp Thr Ala Glu
25 30 35 40

aca gcg gct aga gcc tac gac caa gcc gcg gtt cta atg aac ggc cag 256
Thr Ala Ala Arg Ala Tyr Asp Gln Ala Ala Val Leu Met Asn Gly Gln
45 50 55 

agc gcg aag act aac ttc ccc gtc atc aaa tcg aac ggt tca aat tcc 304
Ser Ala Lys Thr Asn Phe Pro Val Ile Lys Ser Asn Gly Ser Asn Ser
60 65 70 

ttg gag att aac tct gcg tta agg tct ccc aaa tca tta tcg gaa cta 352
Leu Glu Ile Asn Ser Ala Leu Arg Ser Pro Lys Ser Leu Ser Glu Leu
75 80 85 

ttg aac gct aag cta agg aag aac tgt aaa gac cag aca ccg tat ctg 400
Leu Asn Ala Lys Leu Arg Lys Asn Cys Lys Asp Gln Thr Pro Tyr Leu
90 95 100

acg tgt ctc cgc ctc gac aac gac agc tca cac atc ggc gtc tgg cag 448
Thr Cys Leu Arg Leu Asp Asn Asp Ser Ser His Ile Gly Val Trp Gln
105 110 115 120 

aaa cgc gcc ggg tca aaa acg agt cca aac tgg gtc aag ctt gtt gaa 496
Lys Arg Ala Gly Ser Lys Thr Ser Pro Asn Trp Val Lys Leu Val Glu
125 130 135

cta ggt gac aaa gtt aac gca cgt ccc ggt ggt gat att gag act aat 544
Leu Gly Asp Lys Val Asn Ala Arg Pro Gly Gly Asp Ile Glu Thr Asn
140 145 150 

aag atg aag gta cga aac gaa gac gtt cag gaa gat gat caa atg gcg 592 
Lys Met Lys Val Arg Asn Glu Asp Val Gln Glu Asp Asp Gln Met Ala
155 160 165 

atg cag atg atc gag gag ttg ctt aac tgg acc tgt cct gga tct gga 640
Met Gln Met Ile Glu Glu Leu Leu Asn Trp Thr Cys Pro Gly Ser Gly
170 175 180 

tcc att gca cag gtc taa aggagaatca ttgaattata tgatcaagat 688
Ser Ile Ala Gln Val
185

aataatatag ttgagggtta ataataatcg agggtaagta atttacgtgt agctaataat 748 

taatataatt ttcgaacata tatatgaata tatgatagct ctagaaatga gtacgtatat 808

atacgtaaac atttttcctc aaatatagta tatgtg 844

<210> 44 
<211> 189
<212> PRT

<213> Arabidopsis thaliana 
<400> 44 

Met Val His Ser Lys Lys Phe Arg Gly Val Arg Gln Arg Gln Trp Gly
1 5 10 15

Ser Trp Val Ser Glu Ile Arg His Pro Leu Leu Lys Arg Arg Val Trp
20 25 30 



Leu Gly Thr Phe Asp Thr Ala Glu Thr Ala Ala Arg Ala Tyr Asp Gln
35 40 45 

Ala Ala Val Leu Met Asn Gly Gln Ser Ala Lys Thr Asn Phe Pro Val
50 55 60 

Ile Lys Ser Asn Gly Ser Asn Ser Leu Glu Ile Asn Ser Ala Leu Arg
65 70 75 80 



Ser Pro Lys Ser Leu Ser Glu Leu Leu Asn Ala Lys Leu Arg Lys Asn 
85 90 95 



Cys Lys Asp Gln Thr Pro Tyr Leu Thr Cys Leu Arg Leu Asp Asn Asp
100 105 110 

Ser Ser His Ile Gly Val Trp Gln Lys Arg Ala Gly Ser Lys Thr Ser
115 120 125 

Pro Asn Trp Val Lys Leu Val Glu Leu Gly Asp Lys Val Asn Ala Arg 
130 135 140 

Pro Gly Gly Asp Ile Glu Thr Asn Lys Met Lys Val Arg Asn Glu Asp
145 150 155 160 

Val Gln Glu Asp Asp Gln Met Ala Met Gln Met Ile Glu Glu Leu Leu
165 170 175 



Asn Trp Thr Cys Pro Gly Ser Gly Ser Ile Ala Gln Val
180 185